Abstract—MEGA is a leading cloud storage platform with
more than 250 million users and 1000 Petabytes of stored data. 
MEGA claims to offer user-controlled, end-to-end security. This is 
achieved by having all data encryption and decryption operations 
done on MEGA clients, under the control of keys that are only 
available to those clients. This is intended to protect MEGA users 
from attacks by MEGA itself, or by adversaries who have taken 
control of MEGA’s infrastructure. 

We provide a detailed analysis of MEGA’s use of cryptography 
in such a malicious server setting. We present five distinct attacks 
against MEGA, which together allow for a full compromise of the 
confidentiality of user files. Additionally, the integrity of user data 
is damaged to the extent that an attacker can insert malicious 
files of their choice which pass all authenticity checks of the 
client. We built proof-of-concept versions of all the attacks. Four 
of the five attacks are eminently practical. They have all been 
responsibly disclosed to MEGA and remediation is underway. 

Taken together, our attacks highlight significant shortcomings 
in MEGA’s cryptographic architecture. We present immediately 
deployable countermeasures, as well as longer-term recommen- 
dations. We also provide a broader discussion of the challenges of 
cryptographic deployment at massive scale under strong threat 
models. 


I. INTRODUCTION 


The cloud — for outsourcing of both computation and data 
storage — has become a very popular approach to address 
scaling and management problems in IT. This applies to both 
enterprise and consumer domains. In the latter case, the market 
offers a myriad of different cloud services, with products 
having different combinations of storage, computation and 
collaboration features, and making a range of security and 
privacy claims. The consumer storage market alone was valued 
at USD 13.6 billion in 2021.¹

As a prominent example, MEGA² is a cloud storage and
collaboration platform founded in 2013 offering “secure stor- 
age and communication” services. With over 250 million reg- 
istered users, 10 million daily active users [1] and 1000 PB of 
stored data [2], MEGA is a significant player in the consumer 
domain. What sets them apart from their competitors such as 
Dropbox, Google Drive, iCloud and Microsoft OneDrive is
the claimed security guarantees: MEGA advertise themselves 
as “the privacy company” and promise user-controlled end-to- 
end encryption (UCE). 

UCE refers to the fact that data uploaded to the MEGA 
cloud is encrypted, and that only the user who owns the 
data has access to the key (derived from the user’s password) 
needed to decrypt. Thus, MEGA’s main selling point is 


¹https://dataintelo.com/report/consumer-cloud-storage-services-market/.
²https://mega.io/


confidentiality of user data even against MEGA themselves, 
as showcased in the following quote from their website [3]: 


“MEGA does not have access to your password or 
your data. Using a strong and unique password will 
ensure that your data is protected from being hacked 
and gives you total confidence that your information 
will remain just that — yours.” 


This implies a threat model in which the service provider 
itself should be considered potentially adversarial, and yet the 
service should remain secure. All the service is then trusted for 
is availability. This adversarial model provides an interesting 
setting for cryptanalysis: not only does the adversary have 
access to encrypted user keys and data, it can also interact 
with users through legitimate channels during steps like user 
authentication and file access. 

This may seem a very strong adversarial model. However, 
we stress that it is consistent with the security claims made 
by MEGA themselves. Moreover, we must consider the pos- 
sibility that even if MEGA is not adversarial, their systems 
may have been compromised by malicious third parties, for 
example nation state security agencies or hacking groups, who 
wish to gain access to users’ data and files. Indeed, the sheer 
size of MEGA, and the likelihood of it attracting users who
wish to protect highly sensitive data precisely because of the
security the service claims to offer, surely make MEGA an
attractive target. Additionally, UCE ensures that MEGA cannot 
be coerced into revealing user data, e.g. through subpoenas, 
since it is technically unable to do so. 

In this work, we review the security of MEGA in this threat 
model and find significant issues in how it uses cryptography. 
These lead to devastating attacks on the confidentiality and 
integrity of user data in the MEGA cloud. 


A. The MEGA Key Hierarchy 


MEGA’s approach to UCE begins with the user password, 
which acts as the root of the key hierarchy depicted in 
Figure 1. The MEGA client derives an authentication key 
and an encryption key from the password. The authentication 
key is used to identify users to MEGA. The encryption key 
encrypts a randomly generated master key, which in turn 
encrypts other key material of the user. Every account has 
a set of asymmetric keys: an RSA key pair for sharing data, a 
Curve25519 key pair for exchanging chat keys for MEGA’s 
chat functionality, and an Ed25519 key pair for signing the 
other keys. Furthermore, the client generates a new key for 
every file or folder (collectively referred to as nodes) uploaded 
by the user. All keys are encrypted by the client with the 


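[Figure 1: key hierarchy diagram. The password (PW) passes through a key
derivation step that yields the authentication key and the encryption key.
The encryption key encrypts the master key $k_M$; the master key encrypts
the RSA share key pair $(pk_{share}, sk_{share})$, the Curve25519 chat key,
the Ed25519 sign key, and the node keys $k_F$. Legend: arrows denote
Encrypt or Sign.]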


Fig. 1. MEGA’s key hierarchy. The master key encrypts the share, chat, sign 
and node keys using AES-ECB. 


master key using AES-ECB and then stored on MEGA’s 
servers to support access from multiple devices. A user on a 
new device can enter their password, authenticate to MEGA, 
fetch the encrypted key material, and decrypt it with the 
encryption key derived from the password. 


B. Trivial Attacks 


The above description of MEGA’s key hierarchy immedi- 
ately leads to a trivial attack. MEGA could mount dictionary 
attacks on user passwords using data revealed in the authen- 
tication protocol (in which the authentication key is sent to 
the MEGA server). This attack is mitigated by users choosing 
strong passwords and by MEGA imposing a suitable password 
policy. 

Moreover, another trivial attack is that MEGA could in- 
troduce malicious code to their web clients. This could be 
used to exfiltrate the user password or keys to the MEGA 
servers. MEGA provides a browser extension — including 
signed updates — which avoids loading code dynamically and 
instead runs it locally. This, to some extent, addresses this 
code integrity issue. 

We do not consider these attacks any further in this paper, 
since they are both mitigated in the MEGA service. 


C. Contributions 


We present a series of five attacks on the key hierarchy 
of MEGA. The first two attacks exploit the lack of integrity 
protection of ciphertexts containing keys (henceforth referred 
to as key ciphertexts), and allow full compromise of all user 
keys encrypted with the master key, leading to a complete 
break of data confidentiality in the MEGA system. The next 
two attacks breach the integrity of file ciphertexts and allow 
a malicious service provider to insert chosen files into users’ 
cloud storage. The last attack is a Bleichenbacher-style attack 
against MEGA’s RSA encryption mechanism. It is applicable 
in a slightly weaker attack model than our first four attacks. 
We have developed proof-of-concept (PoC) implementations 
for all five attacks. We briefly present each attack next. 

1) RSA Key Recovery Attack. Using the session ID ex- 
change at the start of a client connection to MEGA, a 
malicious service provider can recover a user’s private 
RSA share key (used to share file and folder keys) over 
512 login attempts. When the RSA key has been compromised
by the attacker, the confidentiality and integrity
of all node keys shared with the victim is lost. Our attack 
exploits the lack of integrity protection of the encrypted 
RSA key and properties of the RSA-CRT implementation 
used by MEGA clients to build an oracle that leaks one 
bit of information per login attempt about a factor of the 
RSA modulus. It combines this novel attack vector with 
known lattice techniques to accelerate the attack. 
2) Plaintext Recovery Attack. Building on the previous
vulnerability, the malicious service provider can recover 
any plaintext encrypted with AES-ECB under a user’s 
master key. This includes all node keys used for encrypt- 
ing files and folders (including unshared ones not affected 
by the previous attack), as well as the private Ed25519 
signature and Curve25519 chat key. As a consequence, 
the confidentiality of all user data protected by these 
keys, such as files and chat messages, is lost. This 
attack exploits MEGA’s reuse of the master key and the 
use of RSA-CRT, in combination with the abilities of 
the adversary to manipulate key ciphertexts and choose 
plaintexts used in the authentication protocol. We believe 
it to be an entirely novel kind of attack. 

3) Two Integrity Attacks. We present two attacks with
which a malicious service provider can break the integrity 
of the file encryption scheme and insert arbitrary files into 
the user’s file storage which pass the authenticity checks 
during decryption. This enables framing of the user by 
inserting controversial, illegal, or compromising material 
into their file storage. The attacks are non-trivial because 
the adversary cannot properly encrypt node keys without 
access to the user’s master key. 

The first attack uses the previous plaintext recovery attack 
to obtain a suitable node key and then constructs an 
encrypted file. The user cannot demonstrate that they did 
not upload the forged data because the files and keys are 
indistinguishable from genuinely uploaded ones. 

The second integrity attack does not require that the 
attacker has access to a decryption oracle for AES-ECB 
under the master key. Instead, it exploits a fundamental 
problem with the method used by MEGA to “obfuscate” 
file and folder keys before encryption. It needs only 
knowledge of a single AES block and its AES-ECB 
encryption under the user’s master key to create a forgery 
that is difficult to detect. 

4) RSA Decryption Attack. The RSA encryption used
to exchange chat keys as a legacy fallback is vulner- 
able to a novel variant of Bleichenbacher’s attack on 
PKCS#1 v1.5 padding [4]. This attack allows for the 
decryption of RSA ciphertexts containing chat and node 
keys. This is already implied by the key recovery attack 
in point 1, but this attack requires a weaker adversary 
model (which we describe in detail in the sequel). This 
attack is challenging to perform in practice as it would 
require almost $2^{17}$ client interactions. Nevertheless, it exposes
an entirely independent attack vector and uncovers
additional issues in MEGA’s cryptographic design.

In addition to these technical contributions, we propose 
countermeasures to protect against the attacks, as well as a 
discussion of more general learnings about the challenges of 
deploying and maintaining cryptography at scale. 


D. Ethical Considerations 


We contacted MEGA to inform them of the vulnerabilities 
in their system and to suggest three different levels of mit- 
igation (immediate, minimal, and recommended) on March 
24, 2022. We suggested a 90-day disclosure period. MEGA 
acknowledged the attacks on March 24, 2022, confirming that 
the system is vulnerable and needs patching. They decided 
to introduce additional client-side checks on the format of 
RSA private keys to protect against our first attack. While 
these checks directly prevent the RSA key recovery attack, 
and hence by extension the attacks that depend on it, this 
fix significantly differs from our proposed countermeasures. 
MEGA released their patches on June 21, 2022, and awarded 
us a bug bounty. 

We only worked with our own test account when exploring 
MEGA’s services and building our PoCs. We avoided over- 
loading MEGA with login requests when running our attacks. 
We did not attempt to reverse engineer any MEGA server-side 
code, but instead relied on MEGA’s whitepaper [5], inspection 
of client-side code, and the public server API. 


E. Related Work 


This paper is based on Haller’s master thesis [6]. 

1) Previous Attacks on Cloud Storage Systems: Dal- 
skov and Orlandi [7] performed an in-depth analysis of 
SpiderOak One, another provider with user-controlled encryp- 
tion, in a threat model similar to ours. They uncovered several 
vulnerabilities in the cryptographic design of SpiderOak One 
which led to a breach of data confidentiality. Niehage [8] 
discovered four attacks on Nextcloud. The first vulnerability 
exploited that the server could maliciously replace public keys 
due to the lack of integrity protection. The other three attacks 
break file integrity by modifying files partially, replacing them, 
or performing a downgrade attack on the encryption. 

2) Related Cryptographic Attacks: Previous results on key 
recovery through over-writing of key material targeted RSA 
in the context of OpenPGP [9]. The focus was on signatures 
instead of encryption with only partial output. A more sys- 
tematic analysis of the impact of key over-writing attacks on 
OpenPGP was recently given in [10]. Power fault attacks on 
RSA signatures [11], [12] inspired our attack on RSA-CRT, 
however, we tamper with the private key instead of inducing 
errors in single computations. Prior work on authenticated 
encryption without key commitment constructs ciphertexts that 
can be decrypted to two valid files using different keys [13], 
[14], [15]. While they share the setting of AES primitives with 
known keys, we only consider a single key for our integrity 
attacks and target CBC-MAC instead of encryption. 

We contribute an RSA decryption attack that is a novel 
variant of Bleichenbacher’s attack on PKCS#1 v1.5 from 


1998 [4]. Other instances of this attack [16], [17], [18], [19], 
[20] exploit different side-channel leakage to build padding 
oracles. Unlike them, we do not target PKCS#1 v1.5 padding 
but the custom padding scheme of MEGA that includes an 
unknown prefix circumventing the straightforward adaption of 
Bleichenbacher’s attack. 

3) Proposals for Cloud Storage Systems: A high-level sur- 
vey of the functionality and security of multiple cloud storage 
providers is given in [21]. Messmer et al. [22] provide a 
generic security model for a simplified cloud file system. Boyd 
et al. [23] give models for secure cloud storage, including 
consideration of a compromised service provider. Kamara and 
Lauter [24] survey architectures for secure cloud storage with 
a trusted client and an untrusted service provider. Metal [25] 
and Titanium [26] are recent proposals for hiding metadata (as 
well as data) in encrypted file-sharing systems. These papers 
are part of a rich academic literature on secure cloud storage 
and collaboration systems. 


F. Paper Organization 


The next section gives a self-contained description of 
MEGA and its use of cryptography. The ensuing sections 
present our five attacks. Section VII describes our PoCs.
Section VIII describes countermeasures to our attacks. Section IX
discusses the wider implications of our work and future
directions.


II. THE MEGA CLOUD 
A. Notation 


By $[m]_k$ we denote the encryption of a message $m$ with
the key $k$. The encryption algorithm can be derived from the
context. We let $l[a:b]$ denote the slice $(e_a, e_{a+1}, \ldots, e_b)$ from
the tuple $l \leftarrow (e_1, e_2, \ldots, e_n)$, where $1 \le a < b \le n$. We
treat byte strings as tuples of bytes. We use $[a, b]$ to denote
the integer set $\{a, a+1, \ldots, b\}$. B is shorthand for byte. By $\|$
and $\oplus$ we denote string concatenation and XOR, respectively.


B. Key Hierarchy 


At the root of a MEGA client’s key hierarchy, illustrated 
in Figure 1, is the password chosen by the user. From this
and a client-chosen salt, two 128-bit keys are derived using
PBKDF2-HMAC-SHA512: an encryption key $k_e$ and an authentication
key $k_a$. Additionally, the client generates a 128-bit
master key $k_M$, which is encrypted with $k_e$ using AES-ECB
and uploaded to the server. Below the master key in the
hierarchy reside the node keys:³ a set of symmetric AES keys
used to encrypt user files and folders (nodes). A fresh node 
key is generated for each node object created by the user. 
Moreover, each user has three asymmetric key pairs: 

• “Share key”: an RSA key pair for sharing node keys (and
  as a fallback solution for chat key transfer)
• “Chat key”: a Curve25519 key pair for exchanging keys
  for MEGAchat


³Sometimes referred to as data encryption keys in other settings.


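EncNode($k_M$, $F$, attr):

Given: master key $k_M$, node $F$, attributes attr
Returns: encrypted file chunks $[F]_{k_F}$, attributes $[attr]_{k_F}$,
obfuscated key $[k_F^{obf}]_{k_M}$
1 $k_F \leftarrow_\$ \{0,1\}^{128}$
2 $N_F \leftarrow_\$ \{0,1\}^{64}$ // file nonce
3 $l_F \leftarrow 2^j$ for $j \in [10, 13]$
4 $F_1 \| F_2 \| \ldots \| F_n \leftarrow F$
5 For all $i \in [1, n]$:
6   $[F_i]_{k_F}, T_i \leftarrow$ AES-CCM*.Enc($k_F$, $N_F \| (i \cdot l_F)$, $F_i$)
7 $C_F \leftarrow [F_1]_{k_F} \| [F_2]_{k_F} \| \ldots \| [F_n]_{k_F}$
8 $T_{cond} \leftarrow$ CBC-MAC.Tag($k_F$, $T_1 \| T_2 \| \ldots \| T_n$)
9 $k_F^{obf} \leftarrow$ ObfKey($k_F$, $N_F$, $T_{cond}$)
10 $[k_F^{obf}]_{k_M} \leftarrow$ AES-ECB.Enc($k_M$, $k_F^{obf}$)
11 $[attr]_{k_F} \leftarrow$ AES-CBC.Enc($k_F$, iv := $0^{128}$, attr)
12 return $C_F$, $[attr]_{k_F}$, $[k_F^{obf}]_{k_M}$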


Fig. 2. MEGA’s chunkwise file encryption procedure. 


AES-CCM*.Enc($k_F$, $IV$, $F_i$):

Given: node key $k_F$, file chunk $F_i$, initialization vector $IV$
Returns: encrypted file chunk $c_i$, authentication tag $T_i$
1 $N_F \leftarrow IV[1:8]$ // extract file nonce from leftmost bytes
2 $T_i \leftarrow$ CBC-MAC.Tag($k_F$, iv := $N_F \| N_F$, $F_i$)
3 $c_i \leftarrow$ AES-CTR.Enc($k_F$, $IV$, $F_i$)
4 return $c_i$, $T_i$


Fig. 3. MEGA’s custom AES-CCM implementation. 


• “Sign key”: an Ed25519 key pair for signing the other
  public keys
The private keys and the node keys are encrypted with 
AES-ECB under the master key and the resulting ciphertexts 
are stored by MEGA’s servers. 
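
The password-based derivation at the root of the hierarchy can be sketched as follows. This is only an illustration under stated assumptions: the iteration count, the UTF-8 password encoding, and the order in which the 256-bit PBKDF2 output is split into $k_e$ and $k_a$ are choices made here for the example and are not specified in the text above. The function name `derive_keys` is hypothetical.

```python
import hashlib
from typing import Tuple


def derive_keys(password: str, salt: bytes, iterations: int = 100_000) -> Tuple[bytes, bytes]:
    """Derive two 128-bit keys (k_e, k_a) from a password and salt.

    Assumptions (not taken from the paper text): the iteration count,
    UTF-8 encoding, and that k_e is the first half of the output.
    """
    okm = hashlib.pbkdf2_hmac(
        "sha512", password.encode("utf-8"), salt, iterations, dklen=32
    )
    k_e, k_a = okm[:16], okm[16:]  # encryption key, authentication key
    return k_e, k_a


# Example: the same password and salt always yield the same key pair,
# which is what lets a new device re-derive k_e and unwrap the master key.
k_e, k_a = derive_keys("correct horse battery staple", bytes(16))
```

Because both keys come from the same password, anything that lets the server test password guesses against $k_a$ also lets it derive $k_e$ for a correct guess.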


C. Node Encryption 


To encrypt a file $F$, the client first samples a random 128-bit
node key $k_F$ and a 64-bit nonce $N_F$. Large files are then
partitioned into chunks $F_i$ of size between 128 KB and 1 MB.⁴
Consequently, there are between $2^{13}$ and $2^{16}$ AES blocks per
chunk, each 128 bits large. The chunks are encrypted with a
custom implementation of AES-CCM with key $k_F$ and nonce
$N_F$. Additionally, the file attributes attr (containing metadata
such as the filename) are encrypted using AES-CBC with a
zero IV and key $k_F$. The full node encryption procedure is
shown in Figure 2. For folders, which do not have file content,
the file input $F$ is ignored and only the attributes are encrypted.

Figure 3 describes AES-CCM*, MEGA’s variant of 
AES-CCM. This deviates from the specification in RFC 
3610 [27],⁵ which invalidates the formal security analysis
in [29]. However, we did not find attacks on AES-CCM*. 


⁴Clients usually select a single chunk size and use it for all chunks.

⁵The CBC-MAC tag of the plaintext chunk $F_i$ is computed using the IV
$N_F \| N_F$, rather than the zero bytes string (cf. RFC 3602 [28]). Furthermore,
the CBC-MAC tag is returned in the clear, rather than encrypted with the 
first key stream block from the counter mode encryption. This means that 
MEGA’s variant of AES-CCM is effectively an Encrypt-and-MAC scheme, 
rather than MAC-then-Encrypt. 


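ObfKey($k_F$, $N_F$, $T_{cond}$):

Given: node key $k_F$, file nonce $N_F$, condensed MAC $T_{cond}$
Returns: obfuscated file key
1 $T_1 \| T_2 \| T_3 \| T_4 \leftarrow T_{cond}$
2 $T_{meta} \leftarrow T_1 \oplus T_2 \| T_3 \oplus T_4$
3 $x \leftarrow k_F \oplus (N_F \| T_{meta})$
4 return $x \| N_F \| T_{meta}$

DeobfKey($k_F^{obf}$):

Given: obfuscated file key $k_F^{obf}$
Returns: node key $k_F$, file nonce $N_F$, metamac $T_{meta}$
5 $x \| N_F \| T_{meta} \leftarrow k_F^{obf}$
6 $k_F \leftarrow x \oplus (N_F \| T_{meta})$
7 return $k_F$, $N_F$, $T_{meta}$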


Fig. 4. MEGA’s key obfuscation and de-obfuscation procedures. 


To finish the node encryption in Figure 2, the client aggregates
the file chunk MACs $T_1, T_2, \ldots, T_n$ into a single
condensed tag value $T_{cond}$ using CBC-MAC with the key $k_F$
applied to the concatenation of all chunk MACs. Additionally,
the client computes an “obfuscated file key”, $k_F^{obf}$, from the
node key $k_F$, nonce $N_F$, and condensed tag $T_{cond}$. This
$k_F^{obf}$ is then encrypted with AES-ECB under the master key
and uploaded to MEGA’s servers together with the encrypted 
attributes and file chunks. Note that no MAC tag is computed 
for the attributes, implying a lack of integrity protection. 


D. Key Obfuscation 


MEGA applies an obfuscation procedure to the node key, 
nonce and MAC tag before they are encrypted and uploaded to 
the server. The obfuscation, described in Figure 4, aggregates 
the condensed MAC $T_{cond}$ to a so-called “metamac” $T_{meta}$ by
splitting $T_{cond}$ into four 4-byte chunks and XORing together
the first two chunks, as well as the last two chunks. The key
$k_F$ is then XORed with the concatenation of $N_F$ and $T_{meta}$.
The final obfuscated key $k_F^{obf}$ is obtained by appending $N_F$
and $T_{meta}$ to the scrambled file key.
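
The obfuscation and its inverse from Figure 4 can be sketched in a few lines of Python. The byte lengths (16 B key, 8 B nonce, 16 B condensed MAC) follow the sizes given in the text; the function names `obf_key` and `deobf_key` and the sample values are illustrative only.

```python
def xor(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length byte strings."""
    assert len(a) == len(b)
    return bytes(x ^ y for x, y in zip(a, b))


def obf_key(k_f: bytes, n_f: bytes, t_cond: bytes) -> bytes:
    """ObfKey: returns the 32-byte obfuscated key x || N_F || T_meta."""
    assert len(k_f) == 16 and len(n_f) == 8 and len(t_cond) == 16
    t1, t2, t3, t4 = (t_cond[i:i + 4] for i in range(0, 16, 4))
    t_meta = xor(t1, t2) + xor(t3, t4)   # 8-byte "metamac"
    x = xor(k_f, n_f + t_meta)           # scrambled node key
    return x + n_f + t_meta


def deobf_key(k_obf: bytes):
    """DeobfKey: recovers (k_F, N_F, T_meta) from the obfuscated key."""
    assert len(k_obf) == 32
    x, n_f, t_meta = k_obf[:16], k_obf[16:24], k_obf[24:]
    return xor(x, n_f + t_meta), n_f, t_meta


# Round trip with arbitrary sample values.
k_f = bytes(range(16))
n_f = bytes(range(100, 108))
t_cond = bytes(range(200, 216))
k_obf = obf_key(k_f, n_f, t_cond)
recovered = deobf_key(k_obf)
```

Note that the second half of the obfuscated key is exactly the mask applied to the first half, so the construction involves no secret beyond $k_F$ itself.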

Unfortunately, MEGA provides no reasoning for the design 
of the key obfuscation. We hypothesize that the aim is to create 
a binding between the involved components. However, as we 
show in Section V, the structure introduced by ObfKey leads 
to attacks on the integrity of node ciphertexts. 


E. Authentication and Session ID Exchange 


When a user logs into their MEGA account, the client 
derives the authentication key $k_a$ from the password and sends
it over a TLS connection to the server for authentication. The
server compares the first 128 bits of the SHA-256 hash of
$k_a$ to a value stored during registration, indexed by the user’s
email address. If these values match, the server generates a
session ID (SID) $sid$ and pads it with two bytes to the left
and 211 bytes to the right.⁶ Let $(pk_{share}, sk_{share})$ be the 2048-bit
RSA key pair of the user and let $m$ be the padded $sid$.


⁶The exact padding scheme is not published by MEGA. However, the need
for compatibility with the client-side decryption determines the position of 
sid in the encoded string, which suffices for our attacks to work. 


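DecSid($[sk_{share}^{encoded}]_{k_M}$, $[m]_{pk_{share}}$):

Given: encrypted RSA private key $[sk_{share}^{encoded}]_{k_M}$, encrypted
message $[m]_{pk_{share}}$
Returns: decrypted and unpadded SID $sid'$
1 $sk_{share}^{encoded} \leftarrow$ AES-ECB.Dec($k_M$, $[sk_{share}^{encoded}]_{k_M}$)
2 $N, e, d, p, q, d_p, d_q, u \leftarrow$ DecodeRsaKey($sk_{share}^{encoded}$)
3 $m_p' \leftarrow ([m]_{pk_{share}})^{d_p} \bmod p$
4 $m_q' \leftarrow ([m]_{pk_{share}})^{d_q} \bmod q$
5 $t \leftarrow m_p' - m_q' \bmod p$
6 $h \leftarrow t \cdot u \bmod p$
7 $m' \leftarrow h \cdot q + m_q'$
8 $sid' \leftarrow m'[3:45]$
9 return $sid'$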


Fig. 5. SID decryption during MEGA’s client authentication using RSA. 


After generating the SID, the server encrypts $m$ with $pk_{share}$
and sends it to the client, together with the encrypted version
of $sk_{share}$.


Figure 5 provides an in-depth description of the client 
side decryption of the encrypted private RSA key and SID. 
First, the client decrypts the RSA key, which is encoded for 
RSA-CRT as follows: 


$$sk_{share}^{encoded} \leftarrow l(q) \| q \| l(p) \| p \| l(d) \| d \| l(u) \| u \| P.$$


Here, $q$ and $p$ are the two 1024-bit prime factors of the RSA
modulus $N$, $d$ is the secret exponent, and $u \leftarrow q^{-1} \bmod p$
is an additional value useful for the RSA-CRT decryption
described below. $P$ is padding and $l(x)$ denotes the two-byte
big-endian length encoding of the byte-length of $x \in
\{p, q, d, u\}$. For 2048-bit RSA, the encoding consists (with
high probability⁷) of three 128 B elements ($q$, $p$, and $u$) and
one 256 B part $d$. Since AES blocks are 16 B, this results in
41 blocks in total, where the last block contains the eight byte
padding $P$.
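
The block arithmetic above can be checked with a short calculation. The sketch below assumes exactly the sizes stated in the text (two-byte length prefixes, 128 B for $q$, $p$, $u$, 256 B for $d$, 8 B of padding); the helper name `encoded_layout` is illustrative.

```python
AES_BLOCK = 16  # bytes per AES block


def encoded_layout(fields, pad_len=8):
    """Map each field to the 1-indexed AES blocks it touches.

    fields: list of (name, byte_length) in encoding order; each field is
    preceded by a two-byte length prefix, as described in the text.
    Returns (spans, total_blocks).
    """
    offset = 0
    spans = {}
    for name, size in fields:
        offset += 2                          # skip the length prefix l(x)
        first, last = offset, offset + size - 1
        spans[name] = (first // AES_BLOCK + 1, last // AES_BLOCK + 1)
        offset += size
    total = offset + pad_len
    assert total % AES_BLOCK == 0, "encoding must fill whole blocks"
    return spans, total // AES_BLOCK


spans, n_blocks = encoded_layout([("q", 128), ("p", 128), ("d", 256), ("u", 128)])
```

With these sizes the encoding is 656 B long, and `spans["u"]` shows that $u$ starts partway through block 33 and ends in block 41, sharing the last block with the padding.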


Next, DecodeRsaKey parses $sk_{share}^{encoded}$ into components and
calculates $d_p \leftarrow d \bmod (p-1)$ and $d_q \leftarrow d \bmod (q-1)$. The
client only sanity checks the length encodings to ensure that
the result is split into exactly four parts. Neither the padding
nor the lengths of the individual components are verified. No
integrity check is performed during decryption since the key
is encrypted using AES-ECB. After decoding the private key,
the client performs RSA-CRT decryption of the encrypted SID.
Lines 3 and 4 of Figure 5 decrypt the padded session ID $m$
in the smaller rings $\mathbb{Z}_p$ and $\mathbb{Z}_q$. Lines 5–7 recover the padded
SID $m'$ using Garner’s formula [30]. Instead of checking the
padding, line 8 uses the known SID length to truncate $m'$ to
$sid'$. Before truncation, $m'$ is left-padded with zero bytes until
its length matches the byte length of $N$. In a correct execution,
we have $sid' = sid$. The client sends $sid'$ in subsequent
requests to the server to complete the authentication.
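
Lines 3–8 of Figure 5 can be sketched with Python integers as follows. The primes below are small, well-known primes chosen only to keep the example fast; they are not MEGA parameters. The left-padding bytes used in the unpadding example are also an assumption, since the text notes that the exact padding scheme is unpublished. The example requires Python 3.8 or later for modular inverses via `pow`.

```python
def rsa_crt_decrypt(c: int, p: int, q: int, d: int, u: int) -> int:
    """RSA-CRT decryption with Garner's formula (Figure 5, lines 3-7)."""
    m_p = pow(c, d % (p - 1), p)   # line 3: decrypt modulo p
    m_q = pow(c, d % (q - 1), q)   # line 4: decrypt modulo q
    t = (m_p - m_q) % p            # line 5
    h = (t * u) % p                # line 6, with u = q^{-1} mod p
    return h * q + m_q             # line 7


def unpad_sid(m_prime: int, n_bytes: int = 256) -> bytes:
    """Line 8: left-pad to the modulus length, keep bytes 3..45 (1-indexed)."""
    return m_prime.to_bytes(n_bytes, "big")[2:45]


# Toy key (illustrative primes, far too small for real use).
p, q, e = 2**61 - 1, 10**9 + 7, 65537
n = p * q
d = pow(e, -1, (p - 1) * (q - 1))
u = pow(q, -1, p)

m = 123456789012345
m_dec = rsa_crt_decrypt(pow(m, e, n), p, q, d, u)

# Unpadding on a 256-byte value: 2 B prefix || 43 B sid || 211 B suffix.
sid = bytes(range(1, 44))
sid_dec = unpad_sid(int.from_bytes(bytes(2) + sid + bytes(211), "big"))
```

With an untampered key, `m_dec` equals `m`, and the truncation returns exactly the 43 B session ID regardless of the surrounding padding bytes, which is why the client never notices a malformed padding.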


⁷Inverses modulo a random $x$-bit number are approximately uniformly
distributed and, thus, have close to x bits with high probability. 


F. MEGAchat 


In addition to the cloud storage service, MEGA provides 
the end-to-end encrypted chat messaging service MEGAchat. 
The chat messages are encrypted with AES-CTR using 16- 
byte keys which are periodically rotated. The sender generates 
these symmetric chat keys and encrypts them for the recipient 
using the user’s long-term asymmetric Curve25519 key. If the 
Curve25519 public key is not available,⁸ the sender uses the
RSA-2048 share keys to encrypt the symmetric chat keys as 
a fallback option. 


III. RSA KEY RECOVERY


In this section, we present a practical attack to recover a 
user’s RSA private key by exploiting the lack of integrity 
protection of key ciphertexts. By tampering with the encrypted 
RSA private key, a malicious server can deceive the client 
into leaking information about one of the prime factors of 
the RSA modulus during the session ID exchange. More 
specifically, the session ID that the client decrypts with the 
mauled private key and sends to the server will reveal whether 
the prime is smaller or greater than an adversarially-chosen 
value. This information enables a binary search for the prime 
factor, with one comparison per client login attempt, allowing 
the adversary to recover the private RSA key after 1023 
client logins. The number of required login attempts can be 
reduced from 1023 to 512 by implementing a lattice-based 
optimization, which allows an attacker to terminate the search 
early and recover the remaining missing bits of the prime. 


A. Threat Model 


We assume a malicious service provider (that is, the adver- 
sary controls MEGA’s core infrastructure). 


B. Attack Description 


The attack begins with a key overwriting step, in which 
the attacker exploits the lack of integrity protection of key 
ciphertexts to modify the client’s outsourced RSA private key. 
The resulting key is altered in the last part of the encoding 
before the padding, such that it contains $u' \ne u = q^{-1} \bmod p$.
When the client uses the thus modified private RSA key to
decrypt a ciphertext $[m]_{pk_{share}}$ containing a message $m$ chosen
by the adversary, the result leaks information about whether
$m < q$ or $m \ge q$, where $q \in [2^{1023}, 2^{1024} - 1]$ is one of the
prime factors of the RSA modulus $N$.

This case distinction oracle arises due to properties of 
RSA-CRT, and allows the adversary to perform a binary 
search for q, by choosing m such that the search interval is 
halved by each client decryption. To make the client decrypt 
messages of its choice, the adversary uses the session ID sent 
from the server at the start of each session. That is, instead 
of choosing a SID the way the server would, the attacker sets 


⁸Comments in the source code suggest that accounts created before 2016
did not have a long-term asymmetric Curve25519 key pair. For contacts 
added before Curve25519 keys were introduced, the sender may not yet 
have updated the record of public keys for the recipient and therefore lack 
the Curve25519 public key. 


$m$ to the middle value of the remaining search interval for $q$
and sends $[m]_{pk_{share}}$, the encryption of $m$, in the place of the
encrypted SID. Once one factor $q$ has been determined, the
adversary can easily recover the remaining private key.

Next we describe the details of each step of the attack. 

1) Key Overwriting: First, $[sk_{share}^{encoded}]_{k_M}$ is modified such
that the resulting decrypted RSA private key contains a different
$u$-value than the original private key encoding. Recall that
$u$ is a 128 B value spanning blocks 33–41 of $sk_{share}^{encoded}$, with
partial coverage of blocks 33 and 41. Since the encoded key is
encrypted with AES-ECB, it can be altered at block granularity
by modifying the corresponding ciphertext block. Hence the
desired modification can be achieved by applying any non-identity
transform to one of ciphertext blocks 33–41. Although
the resulting value is unknown, it suffices for the attack that the
client recovers $u' \ne u := q^{-1} \bmod p$. In our implementation
of the attack, we modify the second to last ciphertext block
of $[sk_{share}^{encoded}]_{k_M}$ to maintain correct length encodings and to
avoid garbling the padding. The former ensures that the client’s
decoding succeeds. The latter increases the robustness of the
attack in case client versions that we did not analyze were to
perform a format check on the padding.
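
The block-granular malleability of ECB mode that this step relies on can be demonstrated without AES. The sketch below deliberately swaps in a toy 4-round Feistel cipher built from SHA-256 as a stand-in for AES (an assumption made so the example needs only the standard library); the property shown, that changing ciphertext block 40 changes only plaintext block 40, holds for any block cipher used in ECB mode.

```python
import hashlib

BLOCK = 16  # block size in bytes, same as AES


def _xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def _f(key: bytes, rnd: int, half: bytes) -> bytes:
    """Round function of the toy Feistel cipher (NOT AES)."""
    return hashlib.sha256(key + bytes([rnd]) + half).digest()[:8]


def toy_encrypt_block(key: bytes, block: bytes) -> bytes:
    left, right = block[:8], block[8:]
    for rnd in range(4):
        left, right = right, _xor(left, _f(key, rnd, right))
    return left + right


def toy_decrypt_block(key: bytes, block: bytes) -> bytes:
    left, right = block[:8], block[8:]
    for rnd in reversed(range(4)):
        left, right = _xor(right, _f(key, rnd, left)), left
    return left + right


def ecb(key: bytes, data: bytes, block_fn) -> bytes:
    """Apply block_fn independently to every 16-byte block (ECB mode)."""
    assert len(data) % BLOCK == 0
    return b"".join(block_fn(key, data[i:i + BLOCK]) for i in range(0, len(data), BLOCK))


key = bytes(range(16))                                # stand-in for k_M
encoded = bytes((7 * i) % 256 for i in range(656))    # stand-in for the 41-block key encoding
ct = bytearray(ecb(key, encoded, toy_encrypt_block))
ct[39 * BLOCK] ^= 0x01                                # maul block 40 (second to last)
mauled = ecb(key, bytes(ct), toy_decrypt_block)
changed = [i + 1 for i in range(41)
           if mauled[i * BLOCK:(i + 1) * BLOCK] != encoded[i * BLOCK:(i + 1) * BLOCK]]
```

Only block 40 decrypts differently, so the length prefixes and the padding in block 41 stay intact while the bytes of $u$ inside block 40 become unpredictable.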

2) RSA-CRT Case Distinction Oracle: In the second step
of the attack, the adversary chooses a message $m$, encrypts
it to $[m]_{pk_{share}}$ and sends it to the client in the place of the
encrypted SID. When the client uses the modified RSA private
key $sk'_{share}$ to decrypt $[m]_{pk_{share}}$, the result allows the attacker
to distinguish the case $m < q$ from $m \ge q$. We analyze
each case separately to show how the oracle arises. A slight
challenge, and a notable difference from previous work on fault
attacks, is that the adversary only sees part of the result of
the decryption, due to the unpadding performed by the client.⁹

Case $m < q$. In this case, $[m]_{pk_{share}}$ correctly decrypts to
$m$, despite the fact that the modified key $sk'_{share}$ is used in
place of $sk_{share}$. To see this, first note that if $m < q$, then
$m_q' = m$, because by the Chinese remainder theorem (CRT)
$m \equiv_q m_q'$, and since $m < q$ we have $m \bmod q = m$. For $m_p'$,
we again have by the CRT that $m \equiv_p m_p'$. Therefore, there
exists $\alpha \in \mathbb{Z}$ such that $m = m_p' + \alpha \cdot p$. Combining these
observations on $m_q'$ and $m_p'$, we have on Line 5 of Figure 5
that $t \leftarrow m_p' - m_q' \bmod p = -\alpha \cdot p \bmod p = 0$. Therefore, $h = 0$, independent of
the value of $u'$. In other words, the client recovers the correct
result $m' \leftarrow h \cdot q + m_q' = m$ despite the modified $u'$ value. This
results in $sid' = 0$ because $m < q < 2^{1024}$. (Recall that $q$ is
a 1024-bit prime.) The client left pads the decrypted $m'$ with
zero bytes to 256 B and then removes the rightmost 211 B.
Therefore, $m$ is hidden in the padding, and the client recovers
and uses $sid' = 0$.

Case $m \ge q$. In this case, the message $m$ will not be
correctly decrypted, allowing the adversary to detect that the
value returned by the client is different from the one sent. To
see this, we consider each step of the RSA-CRT decryption
procedure again.


⁹Recall that in an honest execution, the plaintext $m$ contains the padded
session ID, which the client truncates to 43 B before returning it to the server. 


By definition, there exist $\alpha, \beta \in \mathbb{Z}$ such that $m_p' = m - \alpha \cdot p$
and $m_q' = m - \beta \cdot q$. Then, $t \leftarrow m_p' - m_q' \bmod p = \beta \cdot q \bmod p$.
Since $p$ and $q$ are coprime, $t \ne 0$ iff $\gcd(\beta, p) = 1$. This happens
with overwhelming probability $1 - 1/p$ for primes chosen
uniformly at random. Thus, with high probability (w.h.p.),
$h \leftarrow t \cdot u' \bmod p \ne 0$ and $m' \leftarrow h \cdot q + m_q' \ne m$. Although
$m' \equiv_q m_q'$, we have $m' \not\equiv_p m_p'$ because $u' \ne q^{-1} \bmod p$ and,
therefore, Lines 5–7 of Figure 5 no longer apply the CRT.
We observe that $m' > 256^{211}$ w.h.p. because of the summand
$h \cdot q$. The integers $h$ and $q$ are random numbers of approx. 1024
bits. Therefore, $h \cdot q$ has approx. 2048 bits. Thus, w.h.p., $m'$
is larger than $256^{211}$, giving $sid' \ne 0$.

In conclusion, despite the truncated decryption oracle, the
adversary can distinguish whether $m < q$ or $m \ge q$ with
overwhelming probability based on whether the session ID
$sid'$ returned by the client is 0 or not.

3) Binary Search: Using the RSA-CRT case distinction
oracle, the adversary can perform a binary search for the
RSA prime factor $q$. For each login attempt by the victim,
the adversary sets $m$ to the middle value of the remaining
search interval for $q$ and then uses $m$ instead of the padded
SID. Note that due to the lax padding format used by MEGA,
clients accept any integer $m \in [0, N - 1]$ as a valid padded
SID. Once the RSA factor $q$ has been determined this way,
the adversary can easily recover the remaining private key as
$p \gets N/q$ and $d \gets e^{-1} \bmod (q-1)(p-1)$.
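The case distinction and the binary search can be sketched in a few lines of Python. This is a toy model under stated assumptions, not MEGA's client code: two Mersenne primes stand in for the 1024-bit RSA primes, the byte offsets are scaled down accordingly, and the tampered value `U_TAMPERED` is an arbitrary hypothetical constant that plays the role of the garbled $u'$. All function names are illustrative. It requires Python 3.8 or later for the modular inverse via `pow`.

```python
import hashlib

# Toy stand-ins (assumptions): Mersenne primes replace MEGA's 1024-bit primes.
P = 2**127 - 1
Q = 2**107 - 1
N = P * Q
E = 65537
D = pow(E, -1, (P - 1) * (Q - 1))

N_BYTES = (N.bit_length() + 7) // 8   # 30 bytes; plays the role of 256 B
DROPPED = 14                          # right-hand bytes discarded; 256**14 > Q
PREFIX = 2                            # left-hand bytes discarded

# Hypothetical garbled CRT coefficient u' resulting from a tampered key ciphertext.
U_TAMPERED = int.from_bytes(hashlib.sha256(b"tampered u").digest()[:15], "big")


def client_sid(c: int, u: int) -> bytes:
    """RSA-CRT decryption followed by truncation, using the supplied u."""
    m_p = pow(c, D % (P - 1), P)
    m_q = pow(c, D % (Q - 1), Q)
    t = (m_p - m_q) % P
    h = (t * u) % P
    m_prime = h * Q + m_q
    padded = m_prime.to_bytes(N_BYTES, "big")
    return padded[PREFIX:N_BYTES - DROPPED]


def oracle_m_less_than_q(m: int) -> bool:
    """One login attempt: True iff the returned session ID is all zero."""
    c = pow(m, E, N)                  # adversary encrypts its chosen "SID"
    return not any(client_sid(c, U_TAMPERED))


def recover_q(bits: int):
    """Binary search for q, assuming q has exactly `bits` bits."""
    lo, hi = 1 << (bits - 1), 1 << bits   # invariant: lo < q <= hi
    queries = 0
    while hi - lo > 1:
        mid = (lo + hi) // 2
        queries += 1
        if oracle_m_less_than_q(mid):
            lo = mid
        else:
            hi = mid
    return hi, queries


recovered_q, queries = recover_q(Q.bit_length())
print(recovered_q == Q, queries)
```

Each call to `oracle_m_less_than_q` corresponds to one victim login, so the query counter mirrors the number of login attempts needed for a prime of the toy size.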


C. Impact 


A compromised RSA private key allows the adversary to 
break the confidentiality of all files shared with the victim by 
other MEGA users. Furthermore, chat keys that are exchanged 
using the RSA public key as a fallback can be compromised. 
Of even higher interest than the direct consequences of
recovering the RSA key, however, is the impact of this attack
on the overall security of MEGA’s key hierarchy. The attacks
described in Sections IV and V show how the confidentiality
and integrity of user data can be broken by chaining this attack
with other vulnerabilities.


D. Complexity and Optimizations 


Without optimizations, the attack requires 1023 queries due
to the binary search on an interval of size $2^{1023}$. In the
MEGA web client, a user needs to perform one login (i.e.,
enter their password) for every query. If the adversary is the
service provider, it can stealthily mount the attack by accepting 
any SID returned by the victim. In this case, the client may 
still fail later due to the garbled RSA private key. However, 
decryption errors caused by the faulty key are not always 
exposed to the user; on some occasions, the client simply 
removes and re-fetches the private key. 

We briefly discuss how the number of required login re- 
quests can be reduced to make the attack faster in practice. 

1) Lattice-Based Optimizations: When the most significant
bits of the RSA prime factor have been recovered using
the technique described above, the remaining bits can be
determined without further interaction with the client by using
lattice cryptanalysis. This allows the binary search to be
terminated early, requiring fewer login attempts from the user.
In Appendix A we describe the straightforward application
of a low-dimensional lattice attack adapted from De Micheli
and Heninger [31] to the setting of our RSA key recovery
attack. This approach recovers up to 341 unknown bits and
therefore reduces the required number of queries for the attack
from 1023 to 683. With more complex lattices described
by Howgrave-Graham [32] and May [33], it is possible to
recover up to 512 unknown bits (i.e., the attack needs only
512 queries).

2) Malicious Client Modifications: MEGA’s current web 
clients store user key material and the SID in the browser’s 
local storage by default. Users remain logged in, and the 
browser can access this key material without the need for 
the user to enter their password. We remark that a malicious 
provider could easily modify current clients to re-establish a 
new SID in the background instead of storing it locally. With 
such a seemingly harmless code change, the adversary could 
perform queries whenever the user accesses the MEGA cloud 
storage, entirely unbeknownst to the user. 


IV. PLAINTEXT RECOVERY ATTACK 


As we have seen in the previous section, MEGA’s authentication
protocol can be used as an oracle to recover a user’s
private RSA share key $sk_{\text{share}}$, even though it is encrypted
with the user’s master key using AES-ECB. MEGA encrypts
the chat, sign, and node keys in the same manner, reusing
the master key and without adding integrity protection for any
of the encrypted keys. This leads to the natural question of
whether, having recovered $sk_{\text{share}}$, the adversary can go on to
also recover the other keys.

In this section, we show that this can be done: after inserting
target AES-ECB ciphertext blocks encrypted under the master
key $k_M$ into the AES-ECB ciphertext for $sk_{\text{share}}$ and choosing
the SID in a special way, the session ID returned from the
client during authentication leaks the corresponding plaintext
blocks. For technical reasons explained below, this method can
be used to recover up to two plaintext blocks for each run of
the authentication protocol.


A. Threat Model and Adversary 


Our threat model considers an adversary that controls 
MEGA’s servers. Additionally, we assume that the adversary 
knows the client’s RSA private key. For instance, it can recover 
this key by running the RSA private key recovery attack from 
Section III or performing a forensic investigation of the user’s 
unattended device (e.g., dumping the memory). 

The adversary aims to decrypt two (not necessarily consecutive)
ciphertext blocks $ct_1, ct_2 \in \{0,1\}^{128}$ that are encrypted
with AES-ECB under the master key $k_M$.


B. Attack Description 


The attack consists of three steps: key overwriting, simpli- 
fying RSA-CRT and recovering the plaintext. 


1) Key Overwriting: Recall from Section II-E the encoding
of the RSA private key exchanged during client authentication:

$$sk^{\text{encoded}}_{\text{share}} \gets l(q) \| q \| l(p) \| p \| l(d) \| d \| l(u) \| u \| P$$

The client encrypts this key using AES-ECB under the master
key $k_M$ before uploading $[sk^{\text{encoded}}_{\text{share}}]_{k_M}$ to MEGA. The adversary
can modify this ciphertext at AES block granularity since
AES-ECB encrypts blocks independently. The ciphertext
$[sk^{\text{encoded}}_{\text{share}}]_{k_M}$ can be split into 41 AES ciphertext blocks
$c_1, c_2, \ldots, c_{41}$:

$$c_1 \| c_2 \| \ldots \| c_{41} \gets [sk^{\text{encoded}}_{\text{share}}]_{k_M},$$

where $|c_i| = 16$ bytes for all $i \in [1, 41]$. Of particular interest
to the attack are the last 9 blocks, which contain the encryption
of $u$.¹⁰ Specifically, block $c_{33}$ encrypts the concatenation of
the last 6 bytes of $d$, the length-encoding $l(u)$, and $u[1:8]$ (the
first 8 bytes of $u$). Blocks $c_{34}$–$c_{41}$ contain the ciphertext for
$u[9:128]$ including 8 bytes of trailing padding.

For the plaintext recovery attack, we avoid modifying $c_{33}$
to preserve $l(u)$ and enable successful client-side key parsing.
Instead, we overwrite $c_{34}$ and $c_{35}$ with the target ciphertext
blocks $ct_1$ and $ct_2$. That is, the adversary sends the following
tampered encryption of the RSA key to the client.

$$[sk'^{\,\text{encoded}}_{\text{share}}]_{k_M} \gets c_1 \| \ldots \| c_{33} \| ct_1 \| ct_2 \| c_{36} \| \ldots \| c_{41}$$

This replacement results in a new RSA private key component,
$u'$, which can be split into three parts $u_1 \| x \| u_2 \gets u'$,
where $u_1 = u[1:8]$, $x$ is the decryption of $ct_1 \| ct_2$, and
$u_2 = u[41:128]$ is the remaining preserved plaintext bytes of
the original $u$. The adversary aims to recover $x$, which replaces
the 32 bytes $u[9:40]$ of $u$ in $u'$.
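The block arithmetic behind the key overwriting step can be checked with a short sketch. It assumes the field lengths stated in the text (2-byte length prefixes, 128-byte $q$, $p$ and $u$, a 256-byte $d$, and 8 bytes of padding); the names `FIELDS` and `splice_target_blocks` are illustrative and not taken from MEGA's code.

```python
BLOCK = 16  # AES block size in bytes

# Field lengths of the encoded private key, in order, as described in the text.
FIELDS = [("l(q)", 2), ("q", 128), ("l(p)", 2), ("p", 128),
          ("l(d)", 2), ("d", 256), ("l(u)", 2), ("u", 128), ("P", 8)]

TOTAL_LEN = sum(length for _, length in FIELDS)
U_OFFSET = sum(length for name, length in FIELDS[:7])  # byte offset where u starts


def splice_target_blocks(key_ct: bytes, ct1: bytes, ct2: bytes) -> bytes:
    """Overwrite c34 and c35 (1-indexed) with the target blocks ct1 and ct2."""
    blocks = [key_ct[i:i + BLOCK] for i in range(0, len(key_ct), BLOCK)]
    if len(blocks) != 41 or len(ct1) != BLOCK or len(ct2) != BLOCK:
        raise ValueError("unexpected ciphertext or block length")
    blocks[33] = ct1  # c34: covers u[9:24]
    blocks[34] = ct2  # c35: covers u[25:40]
    return b"".join(blocks)


# u starts 8 bytes into c33 (0-indexed block 32), so c33 holds u[1:8].
print(TOTAL_LEN, U_OFFSET // BLOCK + 1, U_OFFSET % BLOCK)
```

Because ECB decrypts each block independently, the spliced ciphertext decrypts to the original key encoding everywhere except in the 32 bytes $u[9:40]$.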

2) Simplifying RSA-CRT: To recover $x$, we leverage that
the RSA-CRT decryption of the session ID with the modified
$sk'_{\text{share}}$ uses $u'$. By assumption, the adversary knows the
original $sk_{\text{share}}$, including $u_1$ and $u_2$. By replacing the session
ID with a specific message $m$, the RSA-CRT decryption can
be simplified to enable the recovery of $x$.

The adversary chooses $m \gets u \cdot q$ as the “session ID” and
encrypts it using the user’s RSA public key.¹¹ The adversary
sends $[sk'^{\,\text{encoded}}_{\text{share}}]_{k_M}$ and $[m]_{pk_{\text{share}}}$ to the client, which runs
the RSA-CRT SID decryption described in Figure 5. The
particular choice of $m$ gives

$$m_p \gets ([m]_{pk_{\text{share}}})^{d \bmod (p-1)} \bmod p = u \cdot q \bmod p = 1 \quad \text{and}$$
$$m_q \gets ([m]_{pk_{\text{share}}})^{d \bmod (q-1)} \bmod q = u \cdot q \bmod q = 0.$$

This computation is not affected by the modified private key
since it only uses $p$, $q$, and $d$. Furthermore, $m_p = 1$ and
$m_q = 0$ lead to $t = 1$ and $h = u' \bmod p$. Consequently, the
decryption of $[m]_{pk_{\text{share}}}$ simplifies to $m' = h \cdot q$.
We now argue that with high probability, $m' = u' \cdot q$. If the


¹⁰ In the following, we focus on the case where the private exponent d is
256 bytes and u is 128 bytes to simplify the analysis. This means that u spans
blocks 33-41. It would be straightforward to adapt our attack to corner cases
with shorter encodings of d or u.

¹¹ Although m does not have the expected form of a padded SID, the client-side
processing tolerates any message (cf. Figure 5).


adversary were given $m'$, it would be easy to recover $u'$, and
thereby $x$, by computing $u' \gets m'/q$. We discuss later how
the adversary can still recover $x$ although it only receives 43
bytes of $m'$. First, for $m' = u' \cdot q$ to hold, we need that
$u' \bmod p = u'$, i.e., $u' < p$. To see that this is the case
w.h.p., recall that $u < p$ by definition. Split the prime $p$ into
$p_1 \| p_2 \gets p$, where $|p_1| = |u_1| = 8$ B and $|p_2| = 120$ B. By
construction, $u$ and $u'$ both start with $u_1$. Since $u < p$, it can
only be the case that $u' \ge p$ if $u_1 = p_1$. Thus, we derive the
following lower bound on the probability.

$$\Pr[u' < p] \ge 1 - \Pr[u_1 = p_1] = 1 - 2^{-64} > 1 - 2^{-63}$$

Hence, $m' = u' \cdot q$ with probability at least $1 - 2^{-63}$.

3) Recovering Plaintext: We show how an adversary can
recover $x$ from the truncated SID $m'[3:45]$. We assume that
$u' < p$ such that $m' = u' \cdot q$. Let $y_1 \| y_2 \| y_3 \gets m'$, where
$y_1$ is the removed 2-byte prefix, $y_2$ is $m'[3:45]$, and $y_3$ is a
211-byte unknown suffix. To recover $x$, the adversary tries all
possible prefix values $\hat{y}_1 \in \{0,1\}^{16}$ and performs arithmetic
operations on $\hat{y}_1 \| y_2$ to get $u_1 \| x$. The correct prefix guess
$y_1^*$ can be detected when the result starts with $u_1$.¹²

To compute $u_1 \| x$, interpret the involved byte strings as
big-endian encoded integers. Then, from $m' = u' \cdot q$,
$m' = (y_1^* \| y_2) \cdot 256^{211} + y_3$, and $u' = (u_1 \| x) \cdot 256^{88} + u_2$, we obtain

$$\frac{(y_1^* \| y_2) \cdot 256^{211}}{q} = (u_1 \| x) \cdot 256^{88} + u_2 - \frac{y_3}{q}.$$

The last term is bounded by $\frac{y_3}{q} < 2^{665}$. Therefore, at least 39
bits separate the prefix $u_1 \| x$ from $\frac{y_3}{q}$. Thus, the subtraction
of $\frac{y_3}{q}$ can only affect $x$ if $u'$ has the prefix $u_1 \| x \| 0^{39}$.
This happens with a probability of about $2^{-39}$ since $u'$ is
approximately uniformly distributed. Hence, with probability
$1 - 2^{-39}$,

$$u_1 \| x = \left\lfloor \frac{(y_1^* \| y_2) \cdot 256^{211}}{q} \cdot 256^{-88} \right\rfloor,$$

where $u_2$ is rounded away because it is smaller than $256^{88}$.
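The final arithmetic step can be illustrated with a small Python sketch. It models only the recovery of $x$ from the 43 returned bytes, assuming $m' = u' \cdot q$ already holds; the RSA-CRT decryption itself is not simulated. As a simplifying assumption, `q` is a random odd 1024-bit integer rather than a prime, which does not affect the integer arithmetic. All names are illustrative.

```python
import random

PREFIX, SID, SUFFIX = 2, 43, 211      # byte lengths of y1, y2, y3
U1_LEN, X_LEN, U2_LEN = 8, 32, 88     # byte lengths of u1, x, u2


def client_truncate(m_prime: int) -> bytes:
    """What the server observes: bytes 3..45 of the 256-byte value m'."""
    padded = m_prime.to_bytes(PREFIX + SID + SUFFIX, "big")
    return padded[PREFIX:PREFIX + SID]


def recover_x(y2: bytes, q: int, u1: bytes):
    """Try all 2^16 prefixes; accept the guess whose result starts with u1."""
    denom = q * 256 ** U2_LEN          # divide by q, then drop the u2 bytes
    shift = 256 ** SUFFIX
    u1_int = int.from_bytes(u1, "big")
    y2_int = int.from_bytes(y2, "big")
    for guess in range(1 << (8 * PREFIX)):
        head = (guess << (8 * SID)) | y2_int       # integer value of guess || y2
        cand = head * shift // denom               # candidate for u1 || x
        if cand >> (8 * X_LEN) == u1_int:
            return (cand & ((1 << (8 * X_LEN)) - 1)).to_bytes(X_LEN, "big")
    return None


# Demo with pseudo-random stand-in values (fixed seed for repeatability).
rng = random.Random(2022)
q = rng.getrandbits(1024) | (1 << 1023) | 1
u1 = rng.getrandbits(8 * U1_LEN).to_bytes(U1_LEN, "big")
x = rng.getrandbits(8 * X_LEN).to_bytes(X_LEN, "big")
u2 = rng.getrandbits(8 * U2_LEN).to_bytes(U2_LEN, "big")
m_prime = int.from_bytes(u1 + x + u2, "big") * q

recovered = recover_x(client_truncate(m_prime), q, u1)
print(recovered == x)
```

The loop over the 2-byte prefix is the only brute-force component; everything else is a single integer division per guess.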


C. Complexity 


The adversary recovers the two plaintext blocks (32 B)
corresponding to $ct_1, ct_2$ with a probability of at least $1 - 2^{-39}$,
given that $u' < p$. Let $S$ be the event that the attack is
successful. Then, the overall success probability is

$$\Pr[S] = \Pr[S \mid u' < p] \cdot \Pr[u' < p] \ge 1 - 2^{-38}.$$

The adversary needs to iterate over the $2^{16}$ prefixes $\hat{y}_1$, which
is computationally trivial and does not involve any interaction
with the victim. Additionally, a single login attempt of the
user is required to perform the attack. Since each successful
instantiation of the attack recovers two AES-ECB plaintext
blocks, one query suffices to recover a full node key or 32


¹² This method has a small false positive probability of approximately $2^{-64}$
(since $|u_1| = 64$ bits). However, this can be avoided in practice using additional
information to detect when the correct plaintext $x$ has been recovered
(e.g., by attempting to decrypt a file using $x$ as a key).


bytes of asymmetric key material. The attack cannot be used
to decrypt three or more blocks per login attempt because then
the prefix $u_1 \| x$ is changed by the unknown term $\frac{y_3}{q}$ with high
probability. However, the adversary can iterate the attack to
recover as much plaintext as it desires.


D. Practical Considerations 


In practice, the web client often executes a special case¹³
(not shown in Figure 5), where only a single prefix byte is
removed during SID decryption instead of two. The above
attack and analysis are straightforward to adapt to this case.
The adversary iterates over the $2^8$ values $\hat{y}_1 \in \{0,1\}^8$ and
recovers the prefix $u_1 \| x$ with probability $1 - 2^{-31}$. The overall
success probability is $\Pr[S] \ge 1 - 2^{-30}$, and the computational
cost is slightly lower.


V. INTEGRITY ATTACKS 


Having successfully recovered the node, share, chat, and 
sign keys of any MEGA user using the attacks in the preceding 
sections, it is clear that at this point all confidentiality of user 
data is lost. We now turn our attention to integrity, to see 
what guarantees MEGA’s system gives users in terms of file 
authenticity. The result is — perhaps unsurprisingly — that after 
recovering node keys, very little protection remains: access to 
the keys means that the adversary can trivially modify existing 
files by decrypting, changing, and then re-encrypting the files. 
More interesting is the ability to add new files to the user’s 
storage, without relying on existing node keys. We present two 
versions of this stronger type of attack. 

In the first, an attacker can create a node key ciphertext 
by choosing any two AES-ECB ciphertext blocks, and use 
the plaintext recovery attack from Section IV to decrypt the 
obfuscated key. It may then use the knowledge of the resulting 
key, nonce and metamac to forge a valid file ciphertext for a 
plaintext of its choosing, up to one AES block. This enables a 
framing attack on the victim, who will not be able to provide 
cryptographic evidence that they did not upload the forged file. 

The second attack exploits the structure of MEGA’s obfuscated
node keys to create a key ciphertext for the all-zero
key: by repeating a ciphertext block the adversary can
ensure that the client derives the key $k_F = 0^{128}$. This attack
is less surreptitious than the framing attack because of the
low probability of the all-zero key appearing in an honest
execution. In return, it does not rely on the ability to decrypt
arbitrary AES-ECB ciphertexts; a single known plaintext-ciphertext
AES block pair suffices. With a known key, the
attacker can forge a valid ciphertext for a chosen plaintext.


A. Threat Model 


The threat model considers an adversary controlling
MEGA’s core infrastructure. The first attack assumes access
to a decryption oracle $O_{\text{dec}}$ for AES-ECB encryption under
the master key $k_M$. This oracle can be realized by exploiting
the attack in Section IV, for example.


¹³ This special case is unlikely to occur during normal operation but happens
with probability $1 - 2^{-8}$ for our choice $m = u \cdot q$ for the SID.

Fig. 6. Reconstruction of a mostly chosen file using a known node key $k_F$ and nonce $N_F$ that produces a fixed metamac $T_{\text{meta}}$.


The second attack requires knowledge of a single plaintext-ciphertext
pair $(pt, ct)$ such that $ct = \text{AES-ECB.Enc}(k_M, pt)$.
This can again be obtained from the plaintext recovery attack 
in the preceding section, but can also be acquired in other 
ways. For instance, the attacker can use MEGA’s protocol 
for public file sharing to obtain the pair. When a user shares 
a file or folder publicly, they create a link containing the 
obfuscated node key in plaintext. Hence, a malicious cloud 
provider who obtains such a link knows both the plaintext 
and the corresponding ciphertext, since the latter is uploaded 
to MEGA when the file is created (before being shared). 


B. Attack Description 


We first describe the two approaches to derive key material 
for the file forgery, based on the assumed adversarial resources. 
Apart from how the key is derived, both attacks then use 
the same procedure to construct a file and corresponding 
ciphertext which will pass all integrity checks. 

1) Decryption Oracle: In this scenario, the adversary has
access to an AES-ECB decryption oracle $O_{\text{dec}}$. To create
a key ciphertext for which the attacker knows the plaintext
obfuscated key, it proceeds as follows. The adversary selects
two AES ciphertext blocks $ct_1, ct_2 \in \{0,1\}^{128}$ uniformly at
random. Next, it uses the decryption oracle to recover the
corresponding plaintext blocks $pt_1 \| pt_2 \gets O_{\text{dec}}(ct_1 \| ct_2)$.
It then runs the key deobfuscation algorithm described in
Figure 4 to obtain the node key, nonce, and metamac:
$k_F, N_F, T_{\text{meta}} \gets \text{DeobfKey}(pt_1 \| pt_2)$. Note that because
of the random choice of ciphertext blocks, the resulting
encryption key, nonce, and tag are indistinguishable from those
of a genuinely uploaded file.

2) Single Plaintext-Ciphertext Pair: In this second scenario,
we assume that the adversary knows a plaintext-ciphertext
pair $(pt, ct)$ where $ct = \text{AES-ECB.Enc}(k_M, pt)$.
Given this, the adversary can forge a key ciphertext that
decrypts to a node key of all zero bytes. The forgery is
possible due to the structure of obfuscated keys combined
with AES-ECB encryption. The adversary chooses $ct \| ct$ as
encryption for the obfuscated file key. By construction, this
key ciphertext decrypts to $k_F^{\text{obf}} = pt \| pt$, since AES-ECB
decrypts the two blocks independently and with the same
key $k_M$. For $k_F, N_F, T_{\text{meta}} \gets \text{DeobfKey}(pt \| pt)$ this gives
$N_F \| T_{\text{meta}} = pt$ and $k_F = pt \oplus (N_F \| T_{\text{meta}}) = 0^{128}$.

Note that this works regardless of the plaintext content of 
the AES blocks. Hence the attacker can use any plaintext- 
ciphertext pair (pt, ct). The decryption and deobfuscation of 
the key ciphertext will always succeed since node keys are not 
integrity-protected. 
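The all-zero key follows directly from the XOR structure of the obfuscated key. The sketch below assumes the layout implied by the text, namely that the obfuscated key is $(k_F \oplus (N_F \| T_{\text{meta}})) \| N_F \| T_{\text{meta}}$ with an 8-byte nonce and an 8-byte metamac; `deobf_key` is an illustrative stand-in for the DeobfKey algorithm of Figure 4, not MEGA's implementation.

```python
def deobf_key(obf: bytes):
    """Split a 32-byte obfuscated key into (node key, nonce, metamac).

    Assumed layout: obf = (k_F XOR (N_F || T_meta)) || N_F || T_meta.
    """
    if len(obf) != 32:
        raise ValueError("obfuscated key must be 32 bytes")
    masked, tail = obf[:16], obf[16:]
    key = bytes(a ^ b for a, b in zip(masked, tail))
    return key, tail[:8], tail[8:]


# Any known plaintext block pt works: pt || pt always unmasks to the zero key.
pt = bytes(range(16))
key, nonce, metamac = deobf_key(pt + pt)
print(key.hex())
```

The nonce and metamac are then simply the two halves of the known plaintext block, which is all the adversary needs for the file reconstruction step.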

3) File Reconstruction: The adversary now has a node
key $k_F$, a nonce $N_F$ and a metamac $T_{\text{meta}}$ and wants to
forge a ciphertext for a file $F'$ such that it verifies under the
tag $T_{\text{meta}}$. On a high level, the adversary achieves this by
working backward from the metamac and inserting a single
AES block at a convenient location in the malicious file $F$.
Note that many standard file formats such as PNG and PDF
tolerate 128 injected bits (for instance, in the file’s metadata,
as trailing data, or in unused structural components) without
affecting the displayed content. Hence the modified file $F'$ can
be constructed to appear identical to $F$.


Figure 6 visualizes the construction of $F'$ given $k_F$, $N_F$,
and $T_{\text{meta}}$. The dark orange blocks are fixed and imply
constraints that must be satisfied for the forged file to pass
the integrity verification. The adversary starts by creating a
condensed MAC $T_{\text{cond}}$ that produces $T_{\text{meta}}$. For this purpose,
it splits $T_{\text{meta}}$ into two 32-bit chunks $T^1_{\text{meta}}$ and $T^2_{\text{meta}}$. Next,
it chooses $\tau_1, \tau_3 \in \{0,1\}^{32}$ u.a.r. and sets $\tau_2 \gets \tau_1 \oplus T^1_{\text{meta}}$
and $\tau_4 \gets \tau_3 \oplus T^2_{\text{meta}}$. This ensures that the condensed MAC
$T_{\text{cond}} \gets \tau_1 \| \tau_2 \| \tau_3 \| \tau_4$ produces $T_{\text{meta}}$ when the metamac is
computed.
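The inversion of the metamac can be sketched as follows. The folding function `metamac` is an assumption inferred from the construction in the text (the metamac is taken to be $(\tau_1 \oplus \tau_2) \| (\tau_3 \oplus \tau_4)$); the function names are illustrative.

```python
import secrets


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def metamac(t_cond: bytes) -> bytes:
    """Assumed folding: 16-byte condensed MAC -> 8-byte metamac."""
    t1, t2, t3, t4 = (t_cond[i:i + 4] for i in range(0, 16, 4))
    return xor(t1, t2) + xor(t3, t4)


def condensed_mac_for(t_meta: bytes) -> bytes:
    """Pick tau1, tau3 at random and solve for tau2, tau4."""
    tau1, tau3 = secrets.token_bytes(4), secrets.token_bytes(4)
    tau2 = xor(tau1, t_meta[:4])
    tau4 = xor(tau3, t_meta[4:])
    return tau1 + tau2 + tau3 + tau4


target = bytes.fromhex("0011223344556677")
t_cond = condensed_mac_for(target)
print(metamac(t_cond) == target)
```

Since half of the condensed MAC is free, there are $2^{64}$ valid preimages for every metamac, which is what gives the adversary room to work backward.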

Next, the file chunk MAC tags must be set to ensure that
the condensed MAC is $T_{\text{cond}}$. Let $F_1 \| F_2 \| \ldots \| F_n \gets F$ be the
file chunks of the adversarially chosen file, each consisting
of $l_F$ AES blocks. Recall from Figure 2 that $T_{\text{cond}}$ is the
CBC-MAC of the concatenation of all $n$ chunk MACs $T_i$ for
$i \in [1, n]$. Because of the structure of CBC-MAC, all but a
single chunk MAC tag can be chosen freely to give the desired
$T_{\text{cond}}$. The last tag can then be reconstructed from the other
tags and $T_{\text{cond}}$. Hence, to proceed, the adversary selects a
chunk index $j \in [1, n]$ such that the file format of $F$ tolerates
128 random bits (aligned to AES blocks) in the $j$-th chunk.
Then, it calculates all chunk MACs except $T_j$ using MEGA’s
AES-CCM* encryption (cf. Figure 3):

For all $i \in [1, n] \setminus \{j\}$ do:
    $[F_i]_{k_F}, T_i \gets \text{AES-CCM*.Enc}(k_F, N_F \| (i \cdot l_F), F_i)$


Next, the adversary computes the MAC tag
for chunk $j$ by applying a “meet-in-the-middle” CBC-MAC
calculation, to ensure that the result of CBC-MAC over all the
chunk MACs is $T_{\text{cond}}$. That is, it computes the intermediate
condensed MAC $T_{\text{cond},j-1}$ up to chunk $j$ (the “forward
direction”) by calculating the CBC-MAC of $T_1 \| T_2 \| \ldots \| T_{j-1}$.
It also computes the intermediate condensed MAC values for
chunks $j + 1$ to $n$ backward, starting from the desired output
$T_{\text{cond},n} \gets T_{\text{cond}}$. That is, for $i = n-1, n-2, \ldots, j$ let

$$T_{\text{cond},i} \gets \text{AES-ECB.Dec}(k_F, T_{\text{cond},i+1}) \oplus T_{i+1}.$$

The remaining tag $T_j$ is defined by $T_{\text{cond},j-1}$ and $T_{\text{cond},j}$:

$$T_j \gets T_{\text{cond},j-1} \oplus \text{AES-ECB.Dec}(k_F, T_{\text{cond},j})$$
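The meet-in-the-middle computation can be sketched in Python. Because the standard library has no AES, the sketch swaps in a toy 4-round Feistel permutation built from SHA-256 as a stand-in for the AES block cipher; it is not secure and only serves to provide an invertible 16-byte permutation. The zero initial CBC-MAC state and all function names are assumptions made for illustration.

```python
import hashlib


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def _round(key: bytes, rnd: int, half: bytes) -> bytes:
    return hashlib.sha256(key + bytes([rnd]) + half).digest()[:8]


def enc(key: bytes, block: bytes) -> bytes:
    """Toy Feistel permutation standing in for AES-ECB.Enc on one block."""
    left, right = block[:8], block[8:]
    for rnd in range(4):
        left, right = right, xor(left, _round(key, rnd, right))
    return left + right


def dec(key: bytes, block: bytes) -> bytes:
    """Inverse of enc, standing in for AES-ECB.Dec on one block."""
    left, right = block[:8], block[8:]
    for rnd in reversed(range(4)):
        left, right = xor(right, _round(key, rnd, left)), left
    return left + right


def cbc_mac(key: bytes, tags, state: bytes = bytes(16)) -> bytes:
    """CBC-MAC over a list of 16-byte chunk MACs (zero IV assumed)."""
    for tag in tags:
        state = enc(key, xor(state, tag))
    return state


def solve_missing_tag(key: bytes, tags, j: int, target: bytes) -> bytes:
    """Return the value tags[j] must take so that cbc_mac(key, tags) == target."""
    forward = cbc_mac(key, tags[:j])               # T_cond,j-1
    backward = target                              # T_cond,n
    for tag in reversed(tags[j + 1:]):
        backward = xor(dec(key, backward), tag)    # step back to T_cond,i
    return xor(forward, dec(key, backward))        # T_j


key = bytes(16)
tags = [hashlib.sha256(bytes([i])).digest()[:16] for i in range(5)]
target = b"\xaa" * 16
tags[2] = solve_missing_tag(key, tags, 2, target)
print(cbc_mac(key, tags) == target)
```

The index `j` here is 0-based, so `tags[2]` corresponds to the third chunk MAC in the paper's 1-based notation.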


Now, the MAC tag $T_j$ for file chunk $j$ is fixed. For
analogous reasons as above, the adversary can construct and
insert a single AES block into $F_j$ such that the MAC of the
resulting file chunk $F'_j$ is $T_j$. The adversary then sets

$$F' \gets F_1 \| \ldots \| F_{j-1} \| F'_j \| F_{j+1} \| \ldots \| F_n$$

and generates the file ciphertext $C_F$ by encrypting the file
chunks with the key $k_F$. The adversary can also encrypt file
attributes of its choice using AES-CBC with key $k_F$. Note
that any attributes may be chosen and that no modification
is necessary since file attributes are not integrity-protected by
MEGA. Finally, the attacker places the key ciphertext (either
$ct_1 \| ct_2$ for the two randomly chosen ciphertext blocks in the
decryption oracle scenario, or $ct \| ct$ for the known plaintext-ciphertext
pair $(pt, ct)$), the file ciphertext and the attribute
ciphertext in the victim’s cloud storage.


C. Impact 


With either attack, the adversary is able to add a new file 
to the user’s cloud. The file can be chosen by the adversary, 
up to one AES block in a flexible location. The impact of this 
fixed block is small in practice since many file formats tolerate 
sufficiently long sections of arbitrary bytes. 

In MEGA’s threat model, the expected file integrity protection
is that only the user can upload files to their storage due to
the client-side user-controlled encryption. Hence an adversary
exploiting these vulnerabilities can frame a user for possession
of incriminating files that, in theory, only the victim could
have created. For example, a conceivable attack might frame
someone as a whistle-blower and place an extensive collection
of internal documents in that person’s account.


D. Complexity 


The attacks require only a trivial amount of computation. 
The file reconstruction solely uses simple bit operations and 
fast AES block cipher applications. If the decryption oracle is 
instantiated with the plaintext recovery attack from Section IV, 
the attack needs a single user login attempt. The second 
integrity attack does not require additional effort. 


VI. GUESS-AND-PURGE BLEICHENBACHER ATTACK 


In this section, we present a new Guess-and-Purge variant of
Bleichenbacher’s attack [4] (GaP-Bleichenbacher) to decrypt
RSA ciphertexts using a padding oracle exposed by the fallback
chat key exchange for MEGAchat. The attack is adapted
from PKCS#1 v1.5 padding to the custom padding scheme
used by MEGA clients for RSA encryption of chat keys when
no Curve25519 key is available. Our attack devises a new
strategy that guesses the unknown two-byte prefix tolerated
by this padding scheme and quickly purges wrong guesses.
The overall GaP-Bleichenbacher attack requires $2^{16.9}$ client
login attempts on average to decrypt one ciphertext.

Although this attack is weaker than the RSA key recovery 
in Section III (in the sense that a key recovery implies 
plaintext recovery), it is complementary in the vulnerabilities 
that it exploits and hence requires separate countermeasures. 
Additionally, the Bleichenbacher attack may be performed by 
a weaker adversary. 


A. Threat Model and Adversary 


The threat model assumes an adversary with chosen- 
plaintext capability for the RSA encryption used as a fallback 
chat key exchange. The adversary needs a channel to the 
victim over which it can send encrypted chat keys. The 
client throws different errors depending on whether the RSA 
decryption of the chat keys was successful or not. We assume 
that the adversary is able to observe this error oracle. 

We outline two possible realizations of this adversary. First, 
a malicious service provider can send any message encrypted 
with the target’s RSA public key to the user disguised as 
encrypted chat keys. The client reports a different error mes- 
sage to the server when RSA decryption fails and when chat 
message decryption fails (because random bytes are used as 
chat key after a successful RSA decryption). 

Second, another user who has a direct chat with the victim 
may execute the same attack by sending maliciously chosen 
messages instead of chat keys during the key exchange. We 
consider it possible that such an adversary can infer the target’s 
decryption success. Error messages that the target sends to 
the server may be forwarded to the chat partner to inform the 
sender that transmission failed. Otherwise, the adversary could 
observe the encrypted network traffic between MEGA and the 
target’s client and distinguish error messages sent to the server 
based on timing and message sizes. 


B. Attack Outline 


Recall that Bleichenbacher’s attack [4] on PKCS#1 v1.5 
padding maintains an interval of possible plaintexts. It exploits 
the malleability of RSA to test the decryption of multiples 
of the unknown target message. Successful unpadding leaks 
the prefix of the decrypted message due to the structure of 
the PKCS#1 v1.5 padding. This prefix allows an adversary 
to reduce intervals efficiently and recover the plaintext. 

MEGA’s padding scheme has an unknown prefix of two 
bytes that prevents the direct application of Bleichenbacher’s 
attack. Every successful decryption corresponds to many dis- 
joint solution interval candidates leading to a state explosion. 

Our attack guesses the unknown prefix and removes it 
before updating the solution interval. We find empirically that 
wrong guesses lead to an empty solution interval within a few 
iterations of the GaP-Bleichenbacher attack steps. By using 
practical optimizations — including dynamic programming and 
early termination — we can avoid both repeating queries and 
spending time to recover padding bits. 

We provide a detailed description of GaP-Bleichenbacher, 
including the intricate adaption of Bleichenbacher’s attack 
steps, in Appendix B. 


C. Complexity 


Our experiments show that the attack requires $2^{16.9}$ queries
on average, where 25% of all runs need fewer than $2^{14}$ queries,
and the distribution has a long tail. We provide further details
in Appendix B.


D. Impact 


An adversary can use a vulnerable client to decrypt any RSA 
ciphertext due to the reuse of a single RSA key pair. Hence, 
the attack enables the recovery of chat keys transferred using 
RSA, as well as node keys shared with the victim. 


VII. PROOF OF CONCEPT 


We set up a test account on MEGA and implemented a
PoC of our attacks to test them in practice.¹⁴ Each PoC is
implemented in two settings: sim and real.

In sim, the entire attack is run against a local simulation
of the relevant parts of MEGA to avoid affecting MEGA’s
operation. In real, we verify that our simulation accurately
models MEGA’s system by carefully testing the components
of the attack on the MEGA web client v. 4.11.2.¹⁵ Since the
server code is not published, we cannot implement a PoC
where the adversary controls MEGA. Instead, we implement a
MitM attack by installing a bogus TLS root certificate on the
victim’s machine. This setup allows us to impersonate MEGA towards
the user while using the real servers to execute the server code
(which is unknown to us). We can patch server responses and
perform our attacks on the fly since they do not rely on secrets
stored by the server.


¹⁴ The implementation of our PoCs is published on GitHub:
https://github.com/MEGA-Awry/attacks-poc.

¹⁵ Since we exploit fundamental flaws in the cryptographic architecture, we
expect the vulnerabilities to apply to other clients and versions as well.


Fig. 7. Forged file with 128 chosen bits after IEND, the last PNG chunk.


For our RSA key recovery attack from Section III, we ran 
the full binary search and simple lattice optimization described 
in Appendix A in sim to ensure that the attack succeeds 
reliably in 683 login attempts. The lattice recovery of 341 
missing bits of the prime succeeded in 1000 out of 1000 runs. 
Afterwards, we recovered the first and last few bits of the RSA 
prime factor in the real setting. This way, we were able to 
verify the correctness of the attack while avoiding having to 
perform excessively many login requests on MEGA’s servers. 


Our plaintext recovery attack from Section IV only requires 
a single login query to recover a node key when the RSA 
private key is known. Therefore, we implemented a full PoC 
in both settings. We investigated the internal state of MEGA’s 
command-line client to obtain the private key material of our 
test account and the expected node key decryption. 


Using the integrity attacks from Section V, we successfully 
forged a valid ciphertext for the PNG image shown in Figure 7. 
The PNG file format ignores any data appended to the image, 
making it simple to modify the image to add the necessary 
block needed to pass the integrity verification. In sim, we 
verified that our reconstructed MEGA file decryption success- 
fully recovered the forged file. For real, we implemented the 
attacks only on the client-side to avoid uploading persistent 
bogus material to MEGA. We injected the file in the file tree 
hierarchy fetched by the web client and then intercepted the 
load request caused by the user when opening our forged 
image in the MEGA image viewer. Then, we served our forged 
file and verified that it displayed correctly on the client. 


For the fallback chat key transfer over RSA that the 
GaP-Bleichenbacher attack from Section VI targets, source 
code comments suggest that it is only used for accounts 
registered before 2016. Downgrade attacks for newer accounts 
cannot be ruled out but we did not attempt to implement 
this attack in the real setting due to the closed-source 
server code and the substantial number of queries. Instead, 
we only implemented it in sim to show that MEGA’s custom 
padding scheme is fundamentally vulnerable to an adaption of 
Bleichenbacher’s attack on PKCS#1 v1.5 padding. 


VIII. COUNTERMEASURES 


As part of our disclosure to MEGA, we detailed three sets
of countermeasures: immediate patches, suggesting backward-compatible
mitigations to temporarily protect against the most
severe consequences of our attacks; minimal patches, providing
more robust protection while avoiding expensive operations
like re-encryption of all user files; and recommended
measures, proposing steps toward a redesign of MEGA’s
cryptographic architecture. Figure 8 shows an overview of
the proposed changes to the key hierarchy for each set of
mitigations.


A. Immediate Countermeasures 


1) Integrity-Protect Key Ciphertexts: The most effective 
countermeasure against our attacks is to add integrity protec- 
tion for the encrypted user keys stored by MEGA. This can be 
done in a non-invasive way by adding HMAC tags to the key 
ciphertexts. By extending the existing encryption, older clients 
can ignore the new authentication tags and remain functional. 
We advise the use of distinct keys for separate usages of 
HMAC, rather than re-using the master key, to avoid further 
vulnerabilities from the lack of key separation. We refer to a 
key used for this purpose as a “key integrity key” (KIK) in 
Figure 8. 

This measure directly protects against the RSA key recovery
in Section III. Consequently, it also prevents our plaintext
recovery and integrity attacks (Sections IV and V) because they
build on the RSA key decryption and rely on the lack of key
ciphertext integrity. However, this measure should only be
considered a temporary patch, since AES-ECB-then-HMAC still
does not achieve authenticated encryption security. Nevertheless,
we propose this as a temporary solution due to MEGA’s
challenging scale, the urgency of the issues, backward
compatibility considerations, and the ease of implementation.
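A minimal sketch of such a tag, assuming HMAC-SHA256 and a tag that is simply appended to the existing key ciphertext, could look as follows. The function names, the choice of hash, and the tag placement are illustrative assumptions and not MEGA's deployed format; `kik` denotes the key integrity key.

```python
import hashlib
import hmac

TAG_LEN = 32  # HMAC-SHA256 output length in bytes


def protect(kik: bytes, key_ciphertext: bytes) -> bytes:
    """Append an HMAC tag computed under the key integrity key."""
    tag = hmac.new(kik, key_ciphertext, hashlib.sha256).digest()
    return key_ciphertext + tag


def verify(kik: bytes, blob: bytes) -> bytes:
    """Return the key ciphertext if the tag is valid, else raise ValueError."""
    ct, tag = blob[:-TAG_LEN], blob[-TAG_LEN:]
    expected = hmac.new(kik, ct, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("key ciphertext failed integrity check")
    return ct


kik = bytes(range(32))
ciphertext = bytes(656)                       # placeholder for 41 ECB blocks
blob = protect(kik, ciphertext)

# Block-level tampering, as in the key overwriting step, is now detected.
tampered = blob[:528] + b"\x01" * 32 + blob[560:]
try:
    verify(kik, tampered)
    detected = False
except ValueError:
    detected = True
print(verify(kik, blob) == ciphertext, detected)
```

With the check in place, a client would refuse to parse a spliced key ciphertext before any RSA-CRT computation takes place, which removes the oracle the earlier attacks rely on.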

2) Separate Keys: MEGA broadly violates the principle of 
key separation: the practice of using separate keys for separate 
purposes. The most notable instance is the reuse of the master 
key to encrypt all other user keys, enabling the AES-ECB 
plaintext recovery attack in Section IV. As an immediate 
measure, we propose to replace the master key with a new, 
randomly chosen key derivation key, kp, and to use HKDF 
to derive a set of key encryption keys (KEKs) from kp to 
separately encrypt the share, chat, sign, and node keys. 
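
The derivation step can be sketched as follows, using a plain-Python HKDF (RFC 5869, HMAC-SHA256) built from the standard library. The `info` labels, the 16-byte output length, and the variable names are hypothetical and chosen only for illustration; this is not MEGA’s actual key schedule.

```python
import hashlib
import hmac
import os


def hkdf_sha256(ikm: bytes, info: bytes, length: int = 16, salt: bytes = b"") -> bytes:
    """HKDF (RFC 5869) with HMAC-SHA256: extract, then expand."""
    hash_len = hashlib.sha256().digest_size
    if length > 255 * hash_len:
        raise ValueError("requested output too long for HKDF")
    if not salt:
        salt = bytes(hash_len)  # RFC 5869 default: a string of zero bytes
    prk = hmac.new(salt, ikm, hashlib.sha256).digest()          # extract
    okm, block, counter = b"", b"", 1
    while len(okm) < length:                                     # expand
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


key_derivation_key = os.urandom(32)  # new random key replacing the master key

# One KEK per purpose; the labels below are illustrative assumptions.
purposes = ["share", "chat", "sign", "node"]
keks = {p: hkdf_sha256(key_derivation_key, b"kek/" + p.encode()) for p in purposes}
```

Distinct `info` labels yield independent-looking KEKs, so a decryption capability for one key type does not transfer to the others.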

This measure offers additional protection against the 
AES-ECB plaintext recovery attack: the RSA private key 
decryption can no longer be used to decrypt other encrypted 
keys, since they are encrypted with distinct KEKs. However, 
users should also change their passwords after this patch 
is implemented, as a proactive measure to render the old 
master key $k_M$ inaccessible. Without a password change, the
encryption key $k_e$ remains the same and can still decrypt $k_M$.
Thus, if a user could be tricked into decrypting the master 
key ciphertext (for instance, by using an outdated client), our 
attacks could still be performed by a malicious entity that 
stored earlier versions of key ciphertexts. Nevertheless, even 
users who do not update their password would still benefit 


from the proposed key separation, as the AES-ECB plaintext 
recovery attack could no longer be used to compromise the 
keys of newly uploaded files. 

3) Use a Stricter RSA Padding Format: We propose to 
enforce stricter client-side checks on MEGA’s custom RSA 
padding to increase the number of queries needed for the 
GaP-Bleichenbacher attack. Enforcing a fixed 2-byte padding 
prefix would increase the number of queries needed for the 
attack to approximately $2^{33}$ because conforming messages are
then harder to find. This modification is a short-term measure
that does not remove the padding oracle. An attack requiring
$2^{33}$ queries is still worrisome as it could be improved. Nevertheless,
this measure would reduce the practicality of the
attack and make it easier to detect. 


B. Minimal Countermeasures 


The minimal countermeasures address all of our attacks 
pragmatically. We propose to switch to more standard crypto- 
graphic primitives when it does not involve the re-encryption 
of large data volumes. In particular, this means that old 
clients ideally need to be deprecated, or, alternatively, that new 
and old client versions should be supported in parallel. The 
latter requires carefully designed protocols to avoid downgrade 
attacks. 

1) AES-GCM for Key Ciphertexts: In the long term, a stan- 
dardized AEAD scheme should be used to encrypt user keys. 
Hence we propose to replace the ad hoc AES-ECB+HMAC 
construction introduced in the immediate countermeasures 
with AES-GCM for the encryption of the private share, chat 
and sign keys as soon as possible. Ideally, the node keys should 
also be encrypted with AES-GCM, but we postpone this switch 
to the recommended countermeasures due to the conceivable 
complexity of such an operation. 

This measure addresses our attacks more adequately than 
the immediate measures do. It also simplifies the key hierarchy 
(see Figure 8) by removing all KIKs (except the one for node 
keys). However, AES-GCM needs to be used carefully to avoid 
issues with nonce reuse [34], cache side-channel attacks [35], 
[36], fragile authentication [37], [38] and attacks from lack 
of key commitment [13], [14], [15]. We advise MEGA to 
follow standard key-wrapping practices [39] and to use the 
key’s purpose and, for asymmetric primitives, the public key 
as associated data to authenticate control data and avoid key 
confusion attacks. 

2) RSA-OAEP and separate RSA Keys: We suggest the 
use of RSA-OAEP [40] for RSA encryption to protect 
against Bleichenbacher-style attacks on the padding. We 
further recommend adding an additional RSA key pair
$(sk^{\text{legacy}}, pk^{\text{legacy}})$ for the legacy chat key exchange to separate
the two uses of RSA encryption. For compatibility
reasons, MEGA’s custom padding scheme may need to be 
preserved in this legacy code, in which case this part of the 
code will still be vulnerable to the GaP-Bleichenbacher attack. 


C. Recommended Countermeasures 


In this last set of measures, we discuss long-term goals 
for a cryptographic refactoring of MEGA’s architecture to 
adequately address all of our attacks and protect against future
ones. The measures proposed in this section no longer retain 
backward-compatible but insecure functionality. For instance, 
we propose to remove the legacy chat key transfer and, thus, 
require users of deprecated clients to update. However, apart 
from being disruptive, the measures stay in line with the 
current cryptographic architecture and propose a secure way 
to implement it without more changes than necessary to the 
design or functionality. 

1) File Key Encryption: Our general recommendation is to 
use authenticated encryption with associated data to protect 
user keys (and files), and to use distinct keys for distinct 
purposes. In particular, when the design of the system changes 
to introduce updates or new features, new keys should be used 
for new functions to avoid vulnerabilities from legacy code. 

Concretely, we suggest to replace the ad hoc integrity 
protection of node key ciphertexts introduced in the immediate 
countermeasures by AES-GCM. Additionally, instead of using 
the node keys directly, we recommend using HKDF to derive 
separate keys for file encryption, file attribute encryption and 
to compute the condensed MAC tag. We also advise against 
the use of MEGA’s “obfuscated” node key format. Lastly, we 
recommend regular key rotation for all key-encryption keys. 

2) File Encryption: As previously noted, the file encryption 
scheme used by MEGA is a variant of AES-CCM which does 
not conform to the standard [27] since it does not encrypt the 
authentication tag. We again propose to replace it with the 
more widely adopted and efficient AES-GCM. Furthermore, 
we advise to use CMAC [41] to compute MACs (such as the 
metamac) over variable-length messages. 

3) Augmented PAKE for Authentication: As noted in Sec- 
tion I-B, MEGA implements an unusual authentication pro- 
cedure that is susceptible to dictionary attacks: The client 
transmits the authentication key $k_a$, which is half of the output
of PBKDF2-HMAC-SHA512 on the user password, to the server.
A MitM adversary capable of breaking TLS can observe $k_a$
and the salt, and try different passwords to find a match. 
Although publishing a partial output of PBKDF2 is not a 
violation of the PKCS#5 v2.1 standard, [42] showed that such
attacks on PBKDF2 are feasible on custom hardware and GPUs. 
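
The dictionary attack described above can be illustrated with a short Python sketch. The iteration count, the output length, the choice of which half serves as the authentication key, and the candidate list are assumptions made for the demonstration (the iteration count is deliberately low so that the example runs quickly); they are not MEGA’s actual parameters.

```python
import hashlib
import hmac

ITERATIONS = 1000  # assumption for the demo only
OUTPUT_LEN = 32    # assumption: output is split into two 16-byte halves


def derive_auth_key(password: str, salt: bytes) -> bytes:
    """Return the half of the PBKDF2 output that is sent to the server."""
    out = hashlib.pbkdf2_hmac("sha512", password.encode(), salt, ITERATIONS, OUTPUT_LEN)
    return out[OUTPUT_LEN // 2:]  # assumption: the second half is the authentication key


def dictionary_attack(observed_key: bytes, salt: bytes, candidates):
    """Test password guesses offline against an observed authentication key."""
    for guess in candidates:
        if hmac.compare_digest(derive_auth_key(guess, salt), observed_key):
            return guess
    return None


salt = bytes(range(16))
observed = derive_auth_key("correct horse", salt)  # what the MitM adversary sees
recovered = dictionary_attack(observed, salt, ["123456", "password", "correct horse"])
```

Each guess costs the adversary one PBKDF2 evaluation and requires no further interaction with the client or server, which is why the attack parallelizes well on GPUs and custom hardware.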

We propose to replace the current authentication with 
OPAQUE [43], an augmented Password Authenticated Key Ex- 
change (PAKE) protocol recommended by the Crypto Forum 
Research Group [44]. This construction removes the need for 
an authentication key in MEGA: a user can directly authenti- 
cate by proving their knowledge of the password to the server. 
Meanwhile a MitM adversary capable of breaking TLS cannot 
gain any advantage in recovering user passwords, even through 
carrying out pre-computation prior to a server compromise. 
Note, however, that since our threat model considers the 
MEGA server to be adversarial, the version of OPAQUE in [43] 
in which the server does not learn the user password during 
registration should be used. Note also that even using this 
version, MEGA can still mount dictionary attacks against 
individual user passwords, by testing password guesses against 
server-side data needed in OPAQUE. 


After implementing OPAQUE, MEGA should request users to 
change their password. Otherwise an adversary that stored $k_a$
from a previous authentication can still perform a dictionary 
attack. 


IX. DISCUSSION 


MEGA is not the first — and almost certainly not the last — 
system to contain critical security vulnerabilities. While secu- 
rity analysis like ours will remain important in establishing and 
improving the privacy of users of cryptographic systems, it is 
unfortunate that such attacks continue to be discovered. When 
a system has grown popular enough to attract the attention of 
independent researchers, skilled adversaries may have already 
compromised the system. Mitigating attacks cannot undo the 
consequences of such compromises. Additionally, the process 
to patch a large-scale system like MEGA is at best cumber- 
some, at worst impossible. 

The problem of bridging the knowledge gap between cryp- 
tographers and implementers is a long-standing one, and 
beyond the familiar advice to stick to well-tested implementa- 
tions of standardized and provably secure primitives, we will 
not discuss it further in general here. Rather, we choose to 
highlight some specific lessons learned from our review of 
MEGA and give recommendations for the future. 


A. How and Why MEGA’s Design Fails 


The attacks presented in this work arise from unexpected 
interactions between seemingly independent components of 
MEGA’s cryptographic architecture. They hint at the difficulty 
of maintaining large-scale systems employing cryptography, 
especially when the system has an evolving set of features and 
is deployed across multiple platforms. The challenges involved 
in a complete redesign of a cryptographic architecture can 
make ad hoc fixes and short-term solutions attractive. In turn, 
this can lead to even more complexity that becomes more dif- 
ficult to maintain, due to the introduction of new dependencies 
and the desire to provide backward compatibility. 

As an example, our recommended transition to authenticated 
encryption in MEGA would require all customers to download, 
decrypt, re-encrypt, and upload all their data, due to the end- 
to-end security features of the system. With 1000 PB of data 
stored by MEGA, this would take more than half a year at 
MEGA’s peak bandwidth of 1000 Gbit/s. It would also place 
an immense load on MEGA’s storage infrastructure. Perhaps 
because of challenges like this, MEGA decided to deploy a 
more short-term approach, leading to additional complexity. 
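
The half-year estimate can be reproduced with a few lines of arithmetic. The sketch below assumes decimal units (1 PB = 10^15 bytes, 1 Gbit/s = 10^9 bit/s) and that the full bandwidth is available first for downloading and then for uploading all data.

```python
DATA_BYTES = 1000 * 10**15            # 1000 PB of stored data
BANDWIDTH_BITS_PER_S = 1000 * 10**9   # peak bandwidth of 1000 Gbit/s
SECONDS_PER_DAY = 86_400

one_way_seconds = DATA_BYTES * 8 / BANDWIDTH_BITS_PER_S
one_way_days = one_way_seconds / SECONDS_PER_DAY

# Re-encryption requires downloading and then uploading every file.
round_trip_days = 2 * one_way_days

print(f"one way: {one_way_days:.1f} days, round trip: {round_trip_days:.1f} days")
```

A single pass over the data takes roughly 93 days, so the download and upload together need about 185 days of sustained peak bandwidth.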

Hence, a design that anticipates cryptographic updates and 
allows new features to easily be added without introducing 
cross-domain vulnerabilities is crucial. A core feature of such 
a design is good key hygiene. Careful key separation not 
only minimizes the risk for unintended and dangerous interac- 
tions between cryptographic components, but can also protect 
against downgrading attacks and vulnerabilities in legacy code. 

Another central issue in MEGA’s design is the lack of 
consistent provision of integrity for ciphertexts: MEGA at- 
tempted to provide integrity for stored files, but not for the 


keys used to protect those files. This gave rise to a complete 
breach of confidentiality of user data. We hypothesize that 
this distinction in how integrity is provided arises from a 
misunderstanding concerning the strength of the relevant threat 
model for the analysis of MEGA. By now it seems to be well- 
understood that both confidentiality and integrity are needed 
when securing data at rest, but perhaps this is not so obviously 
true when securing keys at rest when faced with a malicious 
service provider. Some practitioners may also have the impres- 
sion that security notions for authenticated encryption assume 
unrealistically powerful adversaries. However, as our attacks 
on MEGA show, (partial) decryption oracles can exist in prac- 
tice, especially in the setting of a malicious service provider. 
We observe that RSA-CRT is particularly vulnerable to key 
overwriting attacks in a chosen-plaintext setting since the 
decryption directly uses the prime factors of the RSA modulus, 
potentially leaking useful information for factorization. Here, 
establishing the use of AEAD as the default is an important 
step in reducing the potential for attacks. 


B. Consequences 


Besides the effort and computational power required to 
patch a large system, vulnerabilities like the ones presented 
in this paper can have dire consequences for the system’s 
users. The attacks presented here show that it is possible 
for a motivated party to find and exploit vulnerabilities in 
real world cryptographic architectures, with devastating results 
for security. It is particularly concerning that services like
MEGA, which advertise privacy as a core feature and hence
particularly attract users in need of strong protection, fail to
withstand cryptanalysis. It is conceivable that systems in this
category attract adversaries who are willing to invest signifi- 
cant resources to compromise the service itself, increasing the 
plausibility of high-complexity attacks. Moreover, the cost of 
finding and exploiting such vulnerabilities is amortized by the 
large number of accounts to which they can be applied. 

Once the system has been compromised, recovering security 
(and trust) may be challenging even if the vulnerability is 
discovered and countermeasures applied. Ideally, users should 
be able to regain security after a compromise by updating their 
key material, for example by resetting their password. How- 
ever, even with such defensive mechanisms in place, in-depth 
insight into the vulnerability would be needed for users to 
assess what security guarantees remain, and how well patches 
protect their old and new data. Of course, we cannot expect 
consumers to make such assessments, and they may very well 
lose trust in the provider or fail to (for example) reset their 
password because they cannot judge the security implications. 


C. Future Work 


Given the popularity of cloud computing in general, and 
outsourced storage in particular, it is safe to assume that the 
demand for secure and private cloud services will continue to 
rise. Rather than leaving the task of designing a secure system 
to individual providers — which, paradoxically, would require 
users to trust the (by assumption) untrusted cloud provider 


with a secure design and correct implementation — we advocate 
for a standardization of secure cloud storage. 

Such a standard would ideally provide a secure and robust 
foundation obviating the need for ad hoc designs, while still 
leaving room for vendor-specific customization and improve- 
ments. For instance, support for additional features should be 
provided by design, and the specification should be crypto- 
graphically agile. Developing a good standard would require 
deep engagement with a broad spectrum of stakeholders, 
including but not limited to cryptographers. We believe that 
this would be the easiest path to avoid attacks stemming from 
the lack of expert knowledge among developers, and that it 
would enable users to finally have confidence that their data 
remains just that — theirs. 
APPENDIX A
LATTICE CRYPTANALYSIS FOR RSA KEY RECOVERY 
ATTACK 


We briefly describe the straightforward application of a 
low-dimensional lattice attack adapted from Section 4.2.2 
of De Micheli and Heninger [31]. The attack can be used to
recover up to 341 of the unknown least significant bits of the 
RSA prime factor q after the most significant bits have been 
determined using the key recovery attack in Section III. 

Let $\tilde{q}_2$ be the leftmost $(1024 - l)$ bits of $q$ and $q_1$ the
remaining $l$ unknown bits. Let $q_2 \leftarrow \tilde{q}_2 \cdot 2^l$ such that
$q = q_2 + q_1$. On a high level, we rewrite the problem of recovering
$q_1$ as the task of finding small roots of a polynomial. For this
purpose, we consider the following three polynomials $f_1$, $f_2$,
and $f_3$ over $\mathbb{Z}$, which all have the small root $q_1$ modulo $q$:

$$
\begin{aligned}
f_1(x) &= x \cdot (q_2 + x), & f_1(q_1) &= q_1 \cdot (q_2 + q_1) \equiv_q 0 \\
f_2(x) &= q_2 + x, & f_2(q_1) &= q_2 + q_1 \equiv_q 0 \\
f_3(x) &= N, & f_3(q_1) &= N \equiv_q 0
\end{aligned}
$$

We observe that every linear combination of these polynomials
has the same root $q_1$ modulo $q$. We use this observation to
construct the following lattice basis $B$, where we put the
coefficient vectors of the previous polynomials in the rows
and scale the first column by $L^2$ and the second by $L$ for
$L = 2^l$:

$$
B = \begin{pmatrix} L^2 & L q_2 & 0 \\ 0 & L & q_2 \\ 0 & 0 & N \end{pmatrix}
$$

The column scaling ensures that the $\ell_1$ norm of any vector
in the lattice is an upper bound on the value of the unscaled
polynomial corresponding to the coefficient vector when evaluated
at $q_1$. Consequently, if we can find a vector $w$ with
$\|w\|_1 < q$ in the lattice, then there is a corresponding unscaled
polynomial $g$ such that $|g(q_1)| < q$. Since $q_1$ is a root of $g$
modulo $q$ by construction, it follows that $g(q_1) = 0$ over the
integers.

We can efficiently recover the missing bits $q_1$ of $q$ by
factoring the polynomial. The LLL algorithm [45] finds an
exponential approximation of the shortest vector in polynomial
time. Nguyen and Stehlé showed in [46] that the vector $w$
found by LLL satisfies $\|w\|_2 < 1.02^n \det(B)^{1/n}$ on average
for random lattices. For our lattice basis $B$ with dimension
$n = 3$ and determinant $\det B = L^3 \cdot N$, we derive as the
condition for $\|w\|_1 < q$ that $l < \log_2(N^{1/6})$. We can therefore
recover up to $l = 341$ bits for RSA-2048 using this simple
lattice. Higher-dimensional lattices would allow us to decrease
the number of queries to 512 as shown in [32], [33].
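
To make the setup concrete, the following Python sketch checks on a toy example that the three polynomials vanish at $q_1$ modulo $q$ and evaluates the bit bound for a 2048-bit modulus. The small primes and the number of unknown bits are arbitrary assumptions for illustration; the sketch does not run LLL and is not an implementation of the attack.

```python
# Toy parameters (assumptions): real MEGA keys use 1024-bit primes.
p, q = 1000003, 998244353
N = p * q

l = 10                 # number of unknown least significant bits of q
q1 = q % (1 << l)      # the unknown low part
q2 = q - q1            # the known high part, already shifted left by l bits


def f1(x: int) -> int:
    return x * (q2 + x)


def f2(x: int) -> int:
    return q2 + x


def f3(x: int) -> int:
    return N


# All three polynomials have the root q1 modulo q ...
residues = [f(q1) % q for f in (f1, f2, f3)]

# ... and so does any integer linear combination of them.
combination = (3 * f1(q1) - 5 * f2(q1) + 7 * f3(q1)) % q

# Bound l < log2(N^(1/6)) for a 2048-bit modulus.
max_unknown_bits = 2048 // 6
```

For a 2048-bit $N$, the bound evaluates to 341 bits, matching the figure stated above.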


APPENDIX B 
GUESS-AND-PURGE BLEICHENBACHER ATTACK: 
DETAILED DESCRIPTION 


We first specify MEGA’s custom padding and the oracle 
that it exposes. Then, we extend Bleichenbacher’s attack steps 
for PKCS#1 v1.5 padding [4] to work on MEGA’s custom 
padding despite an unknown prefix. We conclude by analyzing 
the correctness and complexity of our GaP-Bleichenbacher 
attack. 


A. MEGA’s Padding Scheme 


When no long-term Curve25519 chat key is available, 
MEGA uses RSA encryption as a fallback method to exchange 
one or more 16-byte chat keys, concatenated together to K. 
The encryption procedure applies the following padding: 


$$\mathrm{MEGA\_PAD}(K) := t \,\|\, L \,\|\, K \,\|\, P,$$

where $t$ is a two-byte prefix, $L$ encodes the byte length of the
chat key(s) $K$ in two bytes (with big-endian encoding), and
$P$ is random padding that extends the message to 256 bytes.
The server uses $t \leftarrow 0^{16}$.

The client parses the above message after RSA-2048 de- 
cryption as follows. First, it removes and ignores the prefix 
t. Second, it recovers the key length and checks that it is 
a multiple of 16, the byte length of a single chat key. If this 
check succeeds, the client extracts the chat key(s) and discards 
the padding. Otherwise, it throws an exception and reports the 
error to the server. 

The padding removal is successful iff $L$ is of the form
$0^8 \,\|\, \{0,1\}^4 \,\|\, 0^4$. Our GaP-Bleichenbacher attack mainly uses
the zero prefix. The rightmost four zero bits of L can poten- 
tially be used for further optimizations. 
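
The padding and the client-side check can be sketched in Python as follows. The function names are hypothetical, and the upper bound on the length field is an assumption derived from the stated form of $L$ (the description above only mentions the multiple-of-16 check); this is an illustration, not MEGA’s client code.

```python
import os

MSG_LEN = 256   # byte length of an RSA-2048 plaintext
KEY_LEN = 16    # byte length of a single chat key
HEADER_LEN = 4  # two-byte prefix t plus two-byte length field L


def mega_pad(chat_keys: bytes, prefix: bytes = b"\x00\x00") -> bytes:
    """Build t || L || K || P with random padding P up to 256 bytes."""
    if len(chat_keys) % KEY_LEN != 0 or len(chat_keys) > MSG_LEN - HEADER_LEN:
        raise ValueError("chat keys must be a multiple of 16 bytes and fit the message")
    length = len(chat_keys).to_bytes(2, "big")
    padding = os.urandom(MSG_LEN - HEADER_LEN - len(chat_keys))
    return prefix + length + chat_keys + padding


def mega_unpad(message: bytes) -> list:
    """Parse a decrypted message; raising ValueError models the error report."""
    if len(message) != MSG_LEN:
        raise ValueError("unexpected message length")
    # The prefix message[0:2] is removed and ignored.
    length = int.from_bytes(message[2:4], "big")
    if length % KEY_LEN != 0 or length > MSG_LEN - HEADER_LEN:
        raise ValueError("invalid chat key length")
    keys = message[HEADER_LEN:HEADER_LEN + length]
    return [keys[i:i + KEY_LEN] for i in range(0, length, KEY_LEN)]


# Length values accepted by the check: 0, 16, ..., 240.
valid_lengths = [n for n in range(2**16)
                 if n % KEY_LEN == 0 and n <= MSG_LEN - HEADER_LEN]
```

Whether `mega_unpad` raises is exactly the one bit of information that the padding oracle leaks per query. Since the prefix is ignored, any two leading bytes are accepted, which is why the attacker has to guess them.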


B. Attack Description 


This section explains our extension of the original attack 
steps from [4] to account for MEGA’s leakage pattern and the 
unknown prefix. 

To stay close to the notation of [4], let $c = m^e \bmod N$ be
the RSA ciphertext of a target message $m$. Let $B \leftarrow 256^{252}$ be
the power of two that exceeds the largest possible unencoded
plaintext by one. We call a message conforming when it is
correctly padded. Let $m_0$ be the conforming multiple of the
target plaintext. If $m_0$ is of known byte length $l_{m_0}$, then $m_0$
has to be in the interval $[l_{m_0} \cdot B, (l_{m_0} + 1) \cdot B - 1]$ due to the
padding scheme.

Fig. 9. Visualization of the Guess-and-Purge strategy for our modified
Bleichenbacher attack with $T = 2^{16}$ prefix guesses $t \in \{0,1\}^{16}$. The blue
path up to iteration $i$ corresponds to a workload $w = (M_{i,t_i}, H_{i,t_i})$, where
$M_{i,t_i}$ is the solution interval implicitly associated to the path that depends on
the previous choice of multipliers $(s_{0,\varepsilon}, s_{1,\varepsilon}, s_{2,t_2}, \ldots, s_{i,t_i})$ stored in $H_{i,t_i}$.
Paths with empty intervals (marked by a cross) are not extended.

The execution of our attack can be visualized as a tree 
(see Figure 9), where nodes represent a multiplier and the 
corresponding prefix guess. Every node has an associated set 
of intervals of candidate decryptions of our target ciphertext 
c. If the prefix guesses on the path to this node were correct, 
then there is some interval that contains the decryption m 
of our target ciphertext. In every iteration, we select a new 
multiplier for every leaf node. Each multiplier has $T = 2^{16}$
possible prefixes. We add new successor nodes if the interval 
of possible plaintext decryptions is still non-empty. Otherwise, 
we know that there was a wrong prefix guess on this path, and 
we no longer extend this path in the next iteration (marked 
with a cross in Figure 9). The multipliers selected for different 
leaves may differ since they depend on the previous multipliers 
and prefixes. 

We introduce the concept of workloads for our variant of 
Bleichenbacher’s attack. A workload corresponds to a path 
in the attack tree. In other words, it stores the state of one 
possible solution path, including all prefix guesses that led to 
the current state and a set of intervals. We formalize this as 
follows. Let $H_{i,t_i} = (s_{0,t_0}, s_{1,t_1}, \ldots, s_{i,t_i})$ denote the history
of multipliers on the path. The guesses $t_j \in \{0,1\}^{16}$ for all
$j \in (0, i]$ are the leftmost two bytes of $s_{j,t_j} \cdot m_0 \bmod N$ after
left padding the result with zero bytes to 256 B. Let $M_{i,t_i}$
denote the set of closed intervals after iteration $i$, resulting
from the choice of multipliers. We define a workload $w$
to be the tuple $(M_{i,t_i}, H_{i,t_i})$. Furthermore, we denote by
$t_0 = t_1 = \varepsilon$ that there is no prefix guess for the first two
multipliers.

As explained in detail below, the multipliers are chosen independently
of any prefix guess: Step 1 chooses $s_{0,t_0}$ randomly
and Step 2.a linearly searches for $s_{1,t_1}$, starting from a value
derived from the initial bounds. For $k \in [2, i]$, Step 2.c chooses
the multiplier $s_{k,t_k} \in \mathbb{Z}$ based on the shifted prefix guess
$y_k \leftarrow 256^2 \cdot B \cdot t_k$ of the conforming message $s_{k,t_k} \cdot m_0 \bmod N$
and the previous intervals $M_{k-1,t_{k-1}}$ such that $s_{k,t_k}$ reduces
the size of the possible solution intervals adequately. We guess
the shifted prefix $y_k$ before selecting $s_{k,t_k}$.

We remark that our indices for multipliers and prefix 
guesses are only unique within the same workload. We do not 
use globally unique identifiers to avoid a cluttered notation. In 
particular, the multipliers and prefixes for different workloads 
in the same iteration do not have to be equal. 

Finally, we introduce the oracle $O_{c_0}(s)$ which returns true
iff the RSA chat key decryption succeeds for the ciphertext $c_0$
multiplied by $s^e \bmod N$.

For our GaP-Bleichenbacher attack, we perform Step 1
once at the beginning of the attack. For every iteration
$i$, we perform Step 2 to Step 4 for every workload
$w = (M_{i-1,t_{i-1}}, H_{i-1,t_{i-1}}) \in W_{i-1}$.


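Step 1: Blinding. Given a target ciphertext $c = m^e \bmod N$,
we sample random multipliers $s_{0,\varepsilon}$ until the oracle reports
a successful decryption of $c \cdot (s_{0,\varepsilon})^e \bmod N$. For the first
successful value $s_{0,\varepsilon}$, we set:

$$
\begin{aligned}
c_0 &\leftarrow (c \cdot (s_{0,\varepsilon})^e) \bmod N \\
M_{0,\varepsilon} &\leftarrow \{[pt_{\min}, pt_{\max} - 1]\} \\
H_{0,\varepsilon} &\leftarrow (s_{0,\varepsilon}) \\
W_0 &\leftarrow \{(M_{0,\varepsilon}, H_{0,\varepsilon})\} \\
i &\leftarrow 1
\end{aligned}
$$

where $pt_{\min} \leftarrow 0$ and $pt_{\max} \leftarrow 241 \cdot B$ are the smallest
resp. largest possible plaintexts (including length
encoding) which conform to MEGA’s padding. The
subsequent attack recovers $m_0 \leftarrow (s_{0,\varepsilon} \cdot m) \bmod N$,
which is the decryption of $c_0$. We do not need any
prefix guess (as indicated by $\varepsilon$) as $M_{0,\varepsilon}$ contains
a single interval specifying the maximum range of
conforming plaintexts.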

Step 2: Searching for a multiplier satisfying $O_{c_0}$.

Step 2.a: Starting the search. For $i = 1$, we have only
a single workload with $M_{0,\varepsilon} = \{[a, b]\}$ (in the
generic case, $a = 0$ and $b = 241 \cdot B - 1$). We
search for the smallest $s_{1,\varepsilon} \ge N/(b + 1)$ such
that $O_{c_0}(s_{1,\varepsilon})$ returns true.

Step 2.b: Sequential searching with $|M_{i-1,t_{i-1}}| > 1$. For
$i > 1$ and more than one interval left, where
$s_{i-1,t_{i-1}} \in H_{i-1,t_{i-1}}$, we search for the smallest
$s_{i,t_i} > s_{i-1,t_{i-1}}$ where $O_{c_0}(s_{i,t_i})$ returns true.

Step 2.c: Interval-based searching with $|M_{i-1,t_{i-1}}| = 1$.
For $i > 1$ and exactly one interval $[a, b]$ left,
where $s_{i-1,t_{i-1}} \in H_{i-1,t_{i-1}}$, we iterate over all
possible prefix guesses $t_i \in \{0,1\}^{16}$ with the
corresponding shifted values $y_i \leftarrow 256^2 \cdot B \cdot t_i$
of the still unknown value $s_{i,t_i} \cdot m_0 \bmod N$. For
every prefix guess, we search the smallest pair of
variables $r_i$ and $s_{i,t_i}$ which satisfy the following
two constraints as well as $O_{c_0}(s_{i,t_i})$. Due to the
choice of $r_i$ and $s_{i,t_i}$, we approximately halve the
interval $[a, b]$ in Step 3.

We start incrementing $r_i$ from

$$r_i \ge 2 \cdot \frac{b \cdot s_{i-1,t_{i-1}} - pt_{\min} - y_i}{N}.$$

For every $r_i$ value, we try the following multipliers:

$$\frac{pt_{\min} + r_i \cdot N + y_i}{b} \le s_{i,t_i} < \frac{pt_{\max} + r_i \cdot N + y_i}{a}.$$

Section B-C discusses the reasoning for this
search procedure in detail. However, the intuition
is as follows: there exists at least one solution
because we are guaranteed to find a conforming
$s_{i,t_i}$ value for the workload with all correct
prefix guesses since our procedure then performs
Step 2.c from the original Bleichenbacher attack
without an unknown prefix. If we end up using
another multiplier $s_{i,t_i}$ for a wrong prefix guess,
this still reduces our intervals. Although this $s_{i,t_i}$
might not eliminate as many plaintext candidates
as the one for the correct prefix, it is still a
correct multiplier because the oracle decision is
independent of our prefix guess.

Step 3: Narrowing the set of solutions. For all intervals
$[a, b] \in M_{i-1,t_{i-1}}$ and the multiplier guess history
$H_{i-1,t_{i-1}} = (s_{0,t_0}, s_{1,t_1}, \ldots, s_{i-1,t_{i-1}})$, we update
the bounds for every prefix guess $t^* \in \{0,1\}^{16}$
of $(s_{i,t_i} \cdot m_0) \bmod N$ and the corresponding
$y^* \leftarrow 256^2 \cdot B \cdot t^*$. We update the intervals and prefix guess
history as follows, where $s_{i,t^*} \leftarrow s_{i,t_i}$:

$$
\begin{aligned}
M_{i,t^*} &\leftarrow \bigcup_{(a,b,r)} \{[a', b']\} \\
H_{i,t^*} &\leftarrow (s_{0,t_0}, s_{1,t_1}, \ldots, s_{i-1,t_{i-1}}, s_{i,t^*}).
\end{aligned}
$$

In the above equations, the bounds $a'$ and $b'$ are
specified as follows:

$$
\begin{aligned}
a' &\leftarrow \max\left(a, \left\lceil \frac{pt_{\min} + r \cdot N + y^*}{s_{i,t_i}} \right\rceil\right) \\
b' &\leftarrow \min\left(b, \left\lfloor \frac{pt_{\max} - 1 + r \cdot N + y^*}{s_{i,t_i}} \right\rfloor\right)
\end{aligned}
$$

for all $r$ values in the following range:

$$\frac{a \cdot s_{i,t_i} - pt_{\max} + 1 - y^*}{N} \le r \le \frac{b \cdot s_{i,t_i} - pt_{\min} - y^*}{N}.$$

We add a new workload $(M_{i,t^*}, H_{i,t^*})$ to $W_i$ whenever
$M_{i,t^*} \ne \emptyset$.

It is necessary to consider all possible prefixes
$t^* \in \{0,1\}^{16}$ to guarantee the existence of a fully
correct prefix guess history $t_0, t_1, \ldots, t^*$ (which is
implicitly stored in $H_{i,t^*}$). For instance, if we would
only use the prefix $t_i$ for which Step 2.c found the
multiplier $s_{i,t_i}$, then the target plaintext $m_0$ might not
be in any of the remaining intervals because the first
two bytes $t^*$ of $s_{i,t_i} \cdot m_0 \bmod N$ are not equal to $t_i$.

Step 4: Computing the solution. If there is only one workload
$W_{i-1} = \{(M_{i-1,t_{i-1}}, H_{i-1,t_{i-1}})\}$ and only one
interval $M_{i-1,t_{i-1}} = \{[a, a]\}$ containing a single
value, then we have $a = m_0$ and return the solution
$m \leftarrow a \cdot (s_{0,\varepsilon})^{-1} \bmod N$.

Otherwise, we continue executing the attack. If we
did not yet iterate over all workloads in $W_{i-1}$, we
go to Step 2 with the next workload from that set. If
there is no workload left for iteration $i$, then we set
$i \leftarrow i + 1$ and continue with Step 2 for the new set
of workloads $W_i$.
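
The blinding in Step 1 and the unblinding in Step 4 rely only on the multiplicative homomorphism of textbook RSA. The sketch below illustrates both on a toy key; the key size, message, and multiplier are arbitrary assumptions, and the private exponent is used solely to check the result (the attacker learns $m_0$ through the oracle instead).

```python
# Toy RSA key (p = 61, q = 53); real keys are 2048 bits.
N, e, d = 3233, 17, 2753

m = 65                 # target plaintext, unknown to the attacker
c = pow(m, e, N)       # target ciphertext

s0 = 7                 # blinding multiplier (sampled at random in the attack)

# Step 1: c_0 = c * s_0^e mod N encrypts m_0 = s_0 * m mod N.
c0 = (c * pow(s0, e, N)) % N
m0 = pow(c0, d, N)     # only used here to verify the relation

# Step 4: once m_0 is known, remove the blinding factor.
recovered = (m0 * pow(s0, -1, N)) % N
```

The modular inverse `pow(s0, -1, N)` requires Python 3.8 or later and exists whenever $s_0$ is coprime to $N$.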
Fig. 10. Density plot of the number of oracle queries.


C. Correctness 


The extension of Bleichenbacher’s attack [4] to MEGA’s
padding scheme is challenging because of the unknown prefix
$y_i$. We adapt the equations from [4] to account for $y_i$ as
follows:

$$
O_{c_0}(s_i) \implies \exists r \in \mathbb{Z}, \exists t^* \in \{0,1\}^{16} \text{ such that }
pt_{\min} \le s_i \cdot m_0 - r \cdot N - 256^2 \cdot B \cdot t^* \le pt_{\max} - 1. \tag{1}
$$

Let $r \in \mathbb{Z}$ and $t^* \in \{0,1\}^{16}$ with the corresponding shifted
value $y^* \leftarrow 256^2 \cdot B \cdot t^*$ be values satisfying the right-hand side
of the implication in Equation 1, for some $s_i$ such that $O_{c_0}(s_i)$.
We solve the inequalities for $m_0$ to derive the bounds for $m_0$
used in Step 3 to narrow down the intervals:

$$
\frac{pt_{\min} + r \cdot N + y^*}{s_i} \le m_0 \le \frac{pt_{\max} - 1 + r \cdot N + y^*}{s_i}. \tag{2}
$$

Furthermore, we can derive the bounds for $r$ used in Step 3
from Equation 1 by using $a \le m_0 \le b$ for some interval
$[a, b] \in M_{i-1,t_{i-1}}$ and $m_0 \in [a, b]$:

$$
\frac{s_i \cdot a - pt_{\max} + 1 - y^*}{N} \le \frac{s_i \cdot m_0 - pt_{\max} + 1 - y^*}{N} \le r
\le \frac{s_i \cdot m_0 - pt_{\min} - y^*}{N} \le \frac{s_i \cdot b - pt_{\min} - y^*}{N}. \tag{3}
$$

The bounds for the multipliers used in Step 2 can be derived
similarly.
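
The interval update that follows from Equations 2 and 3 can be sketched in Python on a toy instance. The modulus, the 6-byte message format ($B = 256^2$ instead of $256^{252}$), and the simplified oracle (which only checks that the value below the two-byte prefix lies in $[pt_{\min}, pt_{\max} - 1]$) are assumptions made to keep the example small; the function and variable names are hypothetical.

```python
def ceil_div(x: int, y: int) -> int:
    """Ceiling of x / y for integers with y > 0."""
    return -(-x // y)


def narrow(interval, s, t_star, N, B, pt_min, pt_max):
    """Apply the bounds on r (Equation 3) and on m_0 (Equation 2) to one interval."""
    a, b = interval
    y = 256**2 * B * t_star
    r_lo = ceil_div(a * s - pt_max + 1 - y, N)
    r_hi = (b * s - pt_min - y) // N
    result = []
    for r in range(r_lo, r_hi + 1):
        lo = max(a, ceil_div(pt_min + r * N + y, s))
        hi = min(b, (pt_max - 1 + r * N + y) // s)
        if lo <= hi:              # empty intervals are dropped
            result.append((lo, hi))
    return result


# Toy parameters (assumptions, not MEGA's real sizes).
B = 256**2
N = (1 << 47) + 5
pt_min, pt_max = 0, 241 * B
SHIFT = 256**2 * B                # position of the two-byte prefix


def toy_oracle(s: int, m0: int) -> bool:
    """Accept iff s * m0 mod N, with its prefix removed, is in the conforming range."""
    return pt_min <= (s * m0 % N) % SHIFT <= pt_max - 1


m0 = 128 * B + 4567               # secret conforming plaintext
s = ceil_div(N, pt_max)           # start of the linear search, as in Step 2.a
while not toy_oracle(s, m0):
    s += 1

t_star = (s * m0 % N) // SHIFT    # the correct prefix for this multiplier
intervals = narrow((pt_min, pt_max - 1), s, t_star, N, B, pt_min, pt_max)
```

For the correct prefix, one of the returned intervals contains $m_0$. In the actual attack the prefix is unknown, so `narrow` would be called once per guess $t^* \in \{0,1\}^{16}$, and guesses that yield no interval are purged.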

We can use the above statements to prove the correctness
of our algorithm by induction over $i$. Let

$$T(i) = (\exists t_i \in \{0,1\}^{16})\,(\exists [a, b] \in M_{i,t_i}) \text{ s.t. } m_0 \in [a, b]$$

be our induction hypothesis stating that in every iteration, there
exists at least one interval containing the target message $m_0$.
Since the last interval has length one, this implies that we find
the correct plaintext $m_0$ and, thus, return the decryption $m$ of
the target ciphertext $c$.

The base case $T(0)$ trivially holds because $M_{0,\varepsilon} =
\{[pt_{\min}, pt_{\max}]\}$ and $m_0 \in [pt_{\min}, pt_{\max}]$ by definition since
$pt_{\min}$ and $pt_{\max}$ are the smallest, respectively largest, plaintext
values.

We assume $T(i - 1)$ for the induction step and show $T(i)$.
Step 2 uses some $s_i$ with $O_{c_0}(s_i)$ by construction. Therefore,
by Equation 1 there exist $r$ and $t^* \in \{0,1\}^{16}$ such that the
right-hand side of the implication holds. By Equation 3, we
know that the range of $r$ values used in Step 3 contains the
correct one. Furthermore, we iterate over all $t^* \in \{0,1\}^{16}$
and add intervals to $M_{i,t^*}$. Therefore, for the correct $r$ and
$t^*$, we narrow $[a, b]$ to $[a', b']$ in Step 3 where the bounds
from Equation 2 guarantee that $m_0 \in [a', b']$. We conclude
the induction proof by noting that $[a', b'] \in M_{i,t^*}$ implies
$T(i)$.


D. Complexity 


The density histogram in Figure 10 shows that our
GaP-Bleichenbacher attack has a query complexity of $\mu \approx 2^{16.9}$
on average with a comparatively high standard deviation of
$\sigma \approx 2^{17.3}$. A quarter of all runs only require $2^{14}$ queries,
but the distribution has a long tail, and we aborted 71 out of
1000 runs because they exceeded our cutoff of $10^6$ queries.
We use the Freedman-Diaconis binning rule to decide on an 
appropriate number of bins of equal width. 

The query complexity of our GaP-Bleichenbacher attack is
significantly lower than executing $2^{16}$ classic Bleichenbacher
attacks for every prefix guess. Figure 11 visualizes the core
reason: every conforming multiplier $s_{i,t_i}$ allows us to detect
workloads with an invalid prefix guess in $H_{i,t_i}$ because they
result in an empty solution interval $M_{i,t_i} = \emptyset$. The stacked
bar plot shows that the first multiplier $s_{1,\varepsilon}$ adds approximately
2500 plausible prefix guesses. The next multiplier $s_{2,t_2}$ eliminates
more than 95% of the wrong guesses shown with a
blue bar in Figure 11; the remaining workloads are gray with
an error bar showing the standard deviation. As the solution
intervals narrow, every multiplier adds fewer new workloads
while following multipliers eliminate wrong guesses quickly.
We do not require more queries than the classic Bleichenbacher
attack after approximately 28 multipliers because only
a single workload remains.

Our optimizations are another reason for this good performance
compared to the original attack. We use dynamic
programming to avoid repeated queries for different workloads.
Furthermore, we utilize that MEGA’s padding ends in
random bytes, which we do not need to recover. Therefore,
we terminate the attack as soon as the interval of possible
plaintexts has a stable prefix that includes all message bits.

Fig. 11. Number of purged and remaining workloads for every multiplier.


E. Conclusion 


The GaP-Bleichenbacher attack extends the original attack 
on PKCS#1 v1.5 padding to MEGA’s custom padding with 
unknown prefixes. We evaluated this variant to require $2^{16.9}$
queries on average despite guessing a two-byte prefix. The 
attack is still challenging to exploit in practice because it 
requires a substantial number of queries. The adversary is also 
challenging to instantiate in practice. Despite the theoretical 
nature of the GaP-Bleichenbacher attack, it still points out two 
weaknesses of MEGA’s system. First, implementing custom 
padding schemes instead of using provably secure standards is 
dangerous. Second, key reuse allows the adversary to decrypt 
arbitrary RSA ciphertexts using legacy code. 